ACETYL-COA-CARBOXYLASE FROM CANDIDA ALBICANS

The present invention relates to acetyl-CoA-carboxylase (ACCase) genes from
Candida albicans (C. albicans) and methods for their expression. The invention also relates to
novel hybrid organisms for use in such expression methods.

C. albicans is an important fungal pathogen and the most prominent target organism for 
antifungal research. ACCase is an enzyme of fatty acid biosynthesis and essential for fungal 
growth and viability. Inhibitors of the ACCase enzyme should therefore be potent antifungals. 
The ACCase proteins of all organisms are homologous to each other, but they also differ
significantly in amino acid sequence. Because of selectivity problems (for example fungal
versus human) it is extremely important to optimise potential inhibitor leads directly against the
target enzyme (from C. albicans) and not against a homologous but non-identical model protein, for
example from Saccharomyces cerevisiae (S. cerevisiae).

We have now successfully cloned the ACCase gene from C. albicans (hereinafter
referred to as the C. albicans ACC1 gene) and elucidated its full length DNA sequence and
corresponding polypeptide sequence, as set out in Figures 4 and 5 of this application
respectively. The coding DNA sequence of the C. albicans ACC1 gene is 6810 nucleotides in
length and the corresponding protein sequence is 2270 amino acids in length. As will be
explained below, there are two forms of the C. albicans ACC1 gene; the above numbers relate
to the longer version, Met1.

Therefore in a first aspect of the present invention we provide a polynucleotide
encoding a C. albicans ACCase gene, in particular the (purified) C. albicans ACC1 gene as set
out in Figure 5 hereinafter. It will be appreciated that the polynucleotide may comprise any of
the degenerate codes for a particular amino acid, including the use of rare codons. The
polynucleotide is conveniently as set out in Figure 4. It will be apparent from Figure 4 that the
gene is characterised by the start codons Met1 and Met2 (as indicated by the first and second
underlined atg codons, hereinafter referred to as atg1 and atg2 respectively). Both forms of the
gene, starting from Met1 and Met2 respectively, are comprised in the present invention. The
invention further comprises convenient fragments of any one of the above sequences.
Convenient fragments may be defined by restriction endonuclease digests of the sequence; suitable
fragments include a full length C. albicans ACC1 gene (starting with Met1 or Met2) flanked by
unique StuI (5'-end)-NotI (3'-end) restriction sites as detailed in Figure 6.
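By way of non-limiting illustration, the relationship between two in-frame start codons and the two resulting forms of a gene can be sketched as follows. The function name and the toy reading frame are hypothetical and are not taken from Figure 4; the only figures taken from the text are the 6810 nucleotide and 2270 amino acid lengths.

```python
def in_frame_start_codons(orf: str) -> list[int]:
    """Return 0-based nucleotide offsets of every in-frame atg in a reading frame."""
    orf = orf.lower()
    return [i for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] == "atg"]


# Hypothetical toy reading frame (NOT the sequence of Figure 4):
# atg aaa ccc atg ggt taa
toy_orf = "atgaaacccatgggttaa"

starts = in_frame_start_codons(toy_orf)
atg1, atg2 = starts[0], starts[1]

# Length in amino acids of the product starting at each atg, excluding the stop codon.
product_lengths = [(len(toy_orf) - s) // 3 - 1 for s in (atg1, atg2)]

# Consistency check on the lengths quoted in the text: 6810 nucleotides
# correspond to 2270 triplets.
triplets_in_stated_coding_sequence = 6810 // 3
```

In this toy case the longer form starts at offset 0 and the shorter form at offset 9; the same scan applied to a real reading frame would list every candidate in-frame start.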

We also provide a polynucleotide probe comprising any one of the above sequences or
fragments together with a convenient label or marker, preferably a non-radioactive label or
marker. Following procedures well known in the art, the probe may be used to identify
corresponding nucleic acid sequences. Such sequences may be comprised in libraries, such as
cDNA libraries. We also provide RNA transcripts corresponding to any of the above C.
albicans ACC1 sequences or fragments.

In a further aspect of the invention we provide a C. albicans ACC1 enzyme, especially
the ACC1 enzyme having the polypeptide sequence set out in Figure 5, in isolated and purified
form. This is conveniently achieved by expression of the coding DNA sequence of the C.
albicans ACC1 gene set out in Figure 4, using methods well known in the art (for example as
described in the Maniatis cloning manual - Molecular Cloning: A Laboratory Manual, 2nd
Edition 1989, J. Sambrook, E.F. Fritsch & T. Maniatis). As indicated for Figure 4 above, the
enzyme is characterised by two forms, Met1 and Met2. Both forms of the enzyme are comprised
in the present invention.

The C. albicans ACC1 enzyme of the present invention is useful as a target in
biochemical assays. However, to provide sufficient enzyme for a biochemical assay for C.
albicans ACC1 (for example, for a high throughput screen for enzyme inhibitors) the enzyme has to be
purified. Two major constraints impair this purification.

1) Any new organism will necessitate deviation from published procedures because it will differ
in its lysis and protease activity. C. albicans is known to express and secrete many aspartyl
proteases.

2) The expression of C. albicans ACC1 is very low and satisfactory purification results can only
be achieved if the enzyme is overexpressed.

We have now been able to overcome these problems by controlled overexpression of the
C. albicans ACC1 in a Saccharomyces strain. This means that subsequent purification of the
enzyme may then, for example, follow published procedures.

Therefore in a further aspect of the present invention we provide a novel expression
system for expression of a C. albicans ACC1 gene, which system comprises an S. cerevisiae
host strain having a C. albicans ACC1 gene inserted in place of the native ACC1 gene from S.
cerevisiae, whereby the C. albicans ACC1 gene is expressed. Preferred S. cerevisiae strains
include JK9-3Daa and its haploid segregants.

The C. albicans ACC1 gene is preferably over-expressed relative to the level that may be
achieved by a C. albicans wild type strain, i.e. under the control of its own ACC1 promoter.
Whilst we do not wish to be bound by theoretical considerations, we have achieved
approximately 14-fold over-expression relative to the wild-type host S. cerevisiae strain JK9-3D.
This may be achieved by replacing the C. albicans promoter in the expression construct by a
stronger and preferably inducible promoter such as the S. cerevisiae GAL1 promoter.

Controlled overexpression is used to improve expression of a C. albicans polypeptide
relative to expression under the control of a C. albicans promoter. In addition, using procedures
outlined in the accompanying examples, we have been able to isolate a fully functional C.
albicans ACC1 gene as determined by 100% inhibition by Soraphen A.

The novel expression system is conveniently prepared by transformation of a
heterozygous ACC1 deletion strain of a convenient S. cerevisiae host by a convenient plasmid
comprising the C. albicans ACC1 gene. Transformation is conveniently effected using methods
well known in the art of molecular biology (Ito et al. 1983).

The plasmid comprising the C. albicans ACC1 gene and used to transform a convenient
S. cerevisiae host represents a further aspect of the invention. Preferred plasmids for insertion
of the C. albicans ACC1 gene include YEp24, pRS316 and pYES2 (Invitrogen).

The heterozygous ACC1 deletion strain of a convenient (diploid) S. cerevisiae host is
conveniently achieved by disruption, preferably using an antibiotic resistance cassette such as
the kanamycin resistance cassette described by Wach et al (Yeast, 1994, 10, 1793-1808).

The expression systems of the invention may be used together with, for example, cell
growth and enzyme isolation procedures identical to or analogous to those described herein, to
provide an acetyl-CoA-carboxylase (ACCase) gene from C. albicans in sufficient quantity and
with sufficient activity for compound screening purposes.

In a further aspect of the invention we provide the use of an acetyl-CoA-carboxylase
(ACCase) gene from C. albicans in assays to identify inhibitors of the polypeptide. In particular
we provide their use in pharmaceutical or agrochemical research.

As presented above, the C. albicans ACC1 enzyme may be used in biochemical assays to
identify agents which modulate the activity of the enzyme. The design and implementation of
such assays will be evident to the biochemist of ordinary skill. The enzyme may be used to
turn over a convenient substrate whilst incorporating/losing a labelled component to define a
test system. Test compounds are then introduced into the test system and measurements made
to determine their effect on enzyme activity. Particular assays are those used to identify
inhibitors of the enzyme useful as antifungal agents. By way of non-limiting example, the
activity of the ACC1 enzyme may be determined by (i) following the incorporation (HCO3-,
acetyl-CoA) or loss (ATP) of a convenient label from the relevant substrate (T. Tanabe et al,
Methods in Enzymology, 1981, 71, 5-60; M. Matsuhashi, Methods in Enzymology, 1969, 14,
3-16), (ii) following the release of inorganic phosphate from ATP (P. Lanzetta et al, Anal.
Biochem. 1979, 100, 95-97), or (iii) following the oxidation of NADH in a coupled assay, for
example using either fatty acid synthetase or pyruvate kinase/lactate dehydrogenase enzymes.
Convenient labels include carbon-14, tritium, phosphorus-32 or phosphorus-33.
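By way of non-limiting illustration of option (iii), a rate of NADH oxidation may be converted into an enzyme rate with the Beer-Lambert law. The sketch below assumes that absorbance is followed at 340 nm (a wavelength not stated in this specification), uses the standard molar extinction coefficient of NADH at that wavelength, and assumes one NADH oxidised per ATP hydrolysed as in a pyruvate kinase/lactate dehydrogenase couple. The function name and example reading are hypothetical.

```python
# Molar extinction coefficient of NADH at 340 nm, in M^-1 cm^-1.
NADH_EXTINCTION_340 = 6220.0


def coupled_assay_rate_umol_per_min(delta_a340_per_min: float,
                                    reaction_volume_ml: float,
                                    path_length_cm: float = 1.0,
                                    nadh_per_turnover: float = 1.0) -> float:
    """Convert a decrease in A340 per minute into micromoles of product per minute.

    Beer-Lambert: dC/dt (mol/l/min) = (dA/dt) / (extinction * path length).
    Multiplying by the reaction volume in litres gives mol/min.
    """
    molar_per_min = delta_a340_per_min / (NADH_EXTINCTION_340 * path_length_cm)
    mol_per_min = molar_per_min * (reaction_volume_ml / 1000.0)
    return mol_per_min * 1e6 / nadh_per_turnover


# Hypothetical reading: A340 falls by 0.0622 per minute in a 1 ml, 1 cm cuvette.
example_rate = coupled_assay_rate_umol_per_min(0.0622, 1.0)
```

With these illustrative numbers the rate corresponds to 0.01 micromole per minute; a different coupling system would need its own stoichiometry factor.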

Any convenient test compound(s) or library of test compounds may be used. Particular
test compounds include low molecular weight chemical compounds (molecular weight less than
1500 daltons) suitable as pharmaceutical agents for human, animal or plant use.

The enzyme of the invention, and convenient fragments thereof, may be used to raise
antibodies. Such antibodies have a number of uses which will be evident to the molecular
biologist of ordinary skill. Such uses include (i) monitoring enzyme expression, (ii) the
development of assays to measure enzyme activity and precipitation of the enzyme.

In addition we provide antisense polynucleotides specific for all or a part of an ACC1
polynucleotide of the invention.

The invention will now be illustrated but not limited by reference to the following
Table, Example, References and Figures wherein:

Table 1 shows the comparative properties of native and recombinant acetyl-CoA
carboxylase enzymes.

Figure 1 shows partial sequence from the C. albicans genome. Underlined regions
were used to derive PCR primers, to generate a C. albicans ACC1 specific probe.

Figure 2 shows cloned fragments of the C. albicans ACC1 gene isolated from genomic
DNA libraries. Arrows indicate extension of the fragment beyond the region displayed.

Figure 3 shows sequenced XbaI-HindIII and HindIII subclones of clone CLS1-b1.

Figure 4 shows the full DNA sequence of the C. albicans ACC1 gene. The atg start
codons for Met1 and Met2 are in lower case and underlined, as is the tap stop codon.

Figure 5 shows the full protein sequence of the C. albicans ACC1 gene. Putative start
codons for Met1 and Met2 are shown in bold.

Figure 6 shows the generation of a tailored ACC1 gene (minus promoter) for
expression under control of the GAL1 promoter in plasmid pYES2. From the initial ACCase
gene (line 1) the core SacI-BamHI fragment (line 3) is modified by the addition of 3' BamHI-NotI
(line 2) and 5' StuI-SacI fragments (different fragments for Met1 and Met2, lines 5 and 7 respectively) to
generate the final "portable" gene flanked by StuI-NotI (lines 6 and 8).

Figure 7 shows the results of the in-vitro ACCase enzyme assay set out in the
accompanying Example when Soraphen A (a specific inhibitor of the ACCase enzyme) was
supplied (X-axis) over the range 0.1nM-100µM in the dose response regimen of the assay.

Example 1 

Cloning of the C. albicans ACC1 gene and generation of a heterologous S. cerevisiae 
expression system: 


1) Probe generation 

We used the polymerase chain reaction (PCR) to generate a DNA probe between and
including the underlined regions in Figure 1.

2) Identification of clones from a C. albicans genomic library hybridising to the ACCase
probe

The PCR product was labelled using an "ECL direct nucleic acid labelling and
detection kit" (Amersham) as described by the supplier. The PCR product (probe) was then
shown to hybridise to S. cerevisiae (weakly) and C. albicans genomic DNA in a Southern
blot procedure (as described by Maniatis, 1989). Two genomic DNA libraries (CLS1 and CLS2)
of C. albicans (in the yeast-E. coli shuttle plasmids YEp24 and pRS316 respectively, as
described in Sherlock et al. 1994; source: Prof. John Rosamond, Manchester University) were
used to isolate fragments hybridising with the probe, which was radiolabelled using "Ready To
Go" dCTP labelling beads (Pharmacia, as described by the manufacturer). The colony
hybridisation was carried out as described by Maniatis (1989). Hybridising colonies were
identified, plasmid DNA isolated, purified (Qiagen maxiprep, as described by the supplier)
and sequenced (Applied Biosystems, model 377 sequencer) from their junctions with the
plasmid. Several fragments carrying partial ACCase gene sequence as well as one full length
clone could be identified (Figure 2).

3) Sequencing of the cloned gene, comparison with ACCases from S. cerevisiae, other
fungi and higher eukaryotes (plants, mammals, man)

The bulk of the sequence of the C. albicans ACC1 gene was determined (on both
strands) using flanking sequence- or insert sequence-specific primers from defined HindIII
and XbaI-HindIII subfragments (of clone CLS1-b1) cloned into pUC19 (see Figure 4). The
promoter and 5' coding region absent from this clone were established from CLS2-d1 and the
gene's 3' end from CLS2-13 using insert specific primers. All junctions, including the ones
between the HindIII subfragments, were verified from the full length clone CLS2-13 (in
YEp24). The full length DNA sequence of C. albicans (Ca) ACC1 is shown in Figure 5a and
the protein translation in Figure 5b. The two potential start methionines, Met1 and Met2, are
shown in bold.

The protein is homologous to ACCases of other fungi (S. cerevisiae, S. pombe and
U. maydis) and also to the plant (Brassica napus), mammalian (sheep, chicken and rat) and
human enzymes. Of the two potential start codons of C. albicans ACC1, Met2 seems the
more likely one, as the sequence between Met1 and Met2 is unrelated to the other ACCases
and indeed to any other protein sequence in the EMBL/Genbank database. The high degree of
homology between ACCases of different species and the apparent lack of an identifiable
fungal subgroup make it even more important to use the actual target enzyme (here from the
pathogen C. albicans) as a screening tool to identify specific inhibitors.

4) Generation of a heterozygous ACC1 deletion strain of S. cerevisiae

As ACCase is an essential enzyme, only one allele of a diploid cell can be deleted
without loss of viability. One ACC1 gene of a diploid S. cerevisiae strain (JK9-3Daa, Kunz et
al. 1993) was therefore disrupted using the kanamycin resistance cassette as described by
Wach et al., using the protocol described therein. Sporulation of the heterozygous diploid
(ACC1/acc1::KANMx) yields only two viable spores (which are kanamycin-sensitive),
showing the essentiality of the ACC1 gene as well as the characteristic arrest phenotype for
the two inviable spores (as published by Haßlacher et al., 1993).

5) Complementation of a S. cerevisiae ACC1 deletion with the cloned Candida gene,
CaACC1

The heterozygous ACC1/acc1::KANMx strain was transformed with one full length C.
albicans gene (CLS2-13 in YEp24). Expression of the gene from this plasmid will be due to
functionality of the Candida ACC1 promoter in the heterologous S. cerevisiae system.
Complementation of the knockout was demonstrated by sporulating the diploid transformants.
In most cases 3-4 viable (haploid) spores were detected. The analysis of tetrads indicated that
kanamycin-resistant colonies were only formed if they also contained the complementing
CLS2-13 plasmid, as indicated by the presence of the URA3 transformation marker. This
clearly shows that the C. albicans gene fully complements the ACCase function in S.
cerevisiae. Therefore the strain generated can be used to screen for inhibitors which are
specific for the Candida enzyme in the absence of a background of Saccharomyces enzyme.
As demonstrated by its functionality, the heterologous protein folds correctly in the host, S.
cerevisiae, where it must also have been correctly biotinylated by the S. cerevisiae machinery
(carried out by ACC2, encoding protein-biotin-ligase).
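By way of non-limiting illustration, the tetrad logic described in sections 4) and 5) can be written out as a small model. It assumes simple 2:2 segregation of the chromosomal ACC1 and acc1::KANMx alleles, that a spore is viable only if it carries either the chromosomal ACC1 gene or the complementing plasmid, and that the plasmid is scored through its URA3 marker. The function names are hypothetical.

```python
def spore(has_chromosomal_acc1: bool, has_plasmid: bool) -> dict:
    """Predicted phenotype of one haploid spore under the stated assumptions."""
    viable = has_chromosomal_acc1 or has_plasmid
    return {
        "viable": viable,
        # The kanamycin cassette sits in the disrupted allele.
        "kan_resistant": viable and not has_chromosomal_acc1,
        # URA3 is carried on the complementing plasmid.
        "ura_plus": viable and has_plasmid,
    }


def tetrad(plasmid_in_spore: list[bool]) -> list[dict]:
    """Four spores from the heterozygous diploid: two wild type, two disrupted."""
    chromosomal = [True, True, False, False]
    return [spore(c, p) for c, p in zip(chromosomal, plasmid_in_spore)]


untransformed = tetrad([False, False, False, False])
fully_transmitted = tetrad([True, True, True, True])
partly_transmitted = tetrad([True, False, True, False])

viable_counts = [sum(s["viable"] for s in t)
                 for t in (untransformed, fully_transmitted, partly_transmitted)]
```

Under these assumptions the untransformed diploid gives two viable, kanamycin-sensitive spores, while transformants give three or four viable spores depending on plasmid transmission, and every kanamycin-resistant spore is also URA3-positive.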

To facilitate purification of C. albicans ACCase, it is beneficial to achieve
overexpression of the protein in a suitable host. Therefore the C. albicans promoter was
replaced by the stronger and inducible S. cerevisiae GAL1 promoter. As the Candida
sequence had revealed two potential start codons (see Figure 4) for the ACC1 reading frame,
both versions were placed under GAL1 control. To generate appropriate restriction sites for
cloning, the ACC1 gene was modified via PCR at both ends (see Figure 6 above), and cloned
into plasmid pYES2 (Invitrogen) as a StuI-NotI fragment into HindIII (fill-in)-NotI sites of
the vector. The identity of the PCR-modified gene-parts with the original ones was confirmed
by sequencing. Both constructs (Met1 and Met2) complement the S. cerevisiae ACC1
knockout when the cells are grown on galactose but not on glucose (where the GAL1
promoter is switched off). Growth is very poor if the gene is transcribed initiating at Met1,
whereas Met2 restores wild type growth rates in S. cerevisiae.

6) Overexpression of the Ca ACCase to facilitate protein purification and use for
screening purposes

Materials

Growth Media:

Sabouraud dextrose broth

Yeast peptone dextrose broth (YPD)

Yeast peptone galactose broth (YPGal) (i.e. 2% w/v galactose)

Growth of cells 

Candida albicans B2630 (Janssen Pharmaceutica, Beerse, Belgium) was maintained
on Sabouraud dextrose agar slopes at 37°C which were subcultured biweekly. For the
growth of liquid cultures for experiments, C. albicans grown on Sabouraud dextrose agar for
48 h at 37°C was used to inoculate 50 ml Sabouraud dextrose broth containing 500µg/l d-biotin.
This was incubated for 16 h at 37°C on a platform shaker (150 rpm). 1.5 ml of this
culture was added to each of 24 x 2 litre conical flasks, each containing 1 litre of Sabouraud
dextrose broth containing 500µg/l d-biotin, giving a final inoculum concentration of
approximately 1.5x10° cfu/ml. The cultures were grown for 9 h at 37°C (log phase) with
shaking (150 rpm). Cell numbers in liquid culture were determined spectrophotometrically
(Philips PU8630 UV/VIS/NIR Spectrophotometer) at 540 nm in a 1 cm path length cuvette.
Absorbance was linearly related to cell number up to an OD of 2.0.
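By way of non-limiting illustration, the stated linear range can be enforced when converting an absorbance reading at 540 nm into a cell number. The calibration factor below is a hypothetical placeholder, since no factor is given in this specification; only the upper limit of OD 2.0 is taken from the text.

```python
MAX_LINEAR_OD = 2.0  # upper limit of the linear range stated in the text


def cells_per_ml_from_od(od540: float, cells_per_ml_per_od_unit: float) -> float:
    """Convert an OD540 reading to cells per ml using a linear calibration.

    Readings above the linear range are rejected; such a sample would need
    to be diluted and read again.
    """
    if od540 < 0:
        raise ValueError("absorbance cannot be negative")
    if od540 > MAX_LINEAR_OD:
        raise ValueError("reading is outside the linear range; dilute and re-read")
    return od540 * cells_per_ml_per_od_unit


# Hypothetical calibration factor, for illustration only.
HYPOTHETICAL_FACTOR = 1.0e7
estimate = cells_per_ml_from_od(0.5, HYPOTHETICAL_FACTOR)
```

In practice the factor would be established for each organism against the viable count method described below.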

Saccharomyces cerevisiae strains Mey134 and CLS2-13 were maintained on Yeast
peptone dextrose (YPD) agar plates at 30°C, which were subcultured biweekly. For the
growth of liquid cultures for experiments, the S. cerevisiae strains were grown on YPD agar
for 48 h at 30°C and were then used to inoculate 50 ml YPD broth containing 500µg/l d-biotin,
which was incubated at 30°C for 16h on a platform shaker (200 rpm). 2.0 ml of this
culture (approx. 4 x 10^8 cfu/ml) was added to each of 24 x 2 litre conical flasks, each
containing 1 litre of YPD broth containing 500µg/l d-biotin, giving a final inoculum
concentration of approximately 8 x 10^5 cfu/ml. The cultures were grown for 9 h at 30°C (log
phase) with shaking (200 rpm). Cell numbers in liquid culture were determined
spectrophotometrically (Philips PU8630 UV/VIS/NIR Spectrophotometer) at 540 nm in a 1
cm path length cuvette.
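By way of non-limiting illustration, the inoculum concentration quoted above follows from a simple dilution calculation. The function name is hypothetical; the volumes and starting density are those given in the text.

```python
def inoculum_cfu_per_ml(starter_cfu_per_ml: float,
                        starter_volume_ml: float,
                        medium_volume_ml: float) -> float:
    """Final cfu/ml after adding a starter culture to fresh medium."""
    total_volume_ml = medium_volume_ml + starter_volume_ml
    return starter_cfu_per_ml * starter_volume_ml / total_volume_ml


# 2.0 ml of a culture at approximately 4 x 10^8 cfu/ml added to 1 litre of broth.
final_inoculum = inoculum_cfu_per_ml(4e8, 2.0, 1000.0)
```

The result is within 1% of the 8 x 10^5 cfu/ml stated in the text; the small difference arises only from including the 2 ml of starter in the final volume.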

Saccharomyces cerevisiae strains PNS117a 5C, PNS117b 6A, and PNS120a 6C were
maintained on Yeast peptone galactose (YPGal) agar plates at 30°C which were subcultured
biweekly. For the growth of liquid cultures for experiments, the S. cerevisiae strains were
grown on YPGal agar for 48 h at 30°C and were then used to inoculate 50 ml YPGal broth
containing 500µg/l d-biotin and 200µg/ml kanamycin, which were incubated at 30°C for 30h
on a platform shaker (200 rpm). 2.0 ml of this culture (approx. 4 x 10^8 cfu/ml) was added to
each of 24 x 2 litre conical flasks, each containing 1 litre of YPGal broth containing 500µg/l
d-biotin and 200µg/ml kanamycin, giving a final inoculum concentration of approximately 8
x 10^5 cfu/ml. The cultures were grown for approximately 23h at 30°C (log phase) with
shaking (200 rpm).

WO 99/32635 PCT/GB98/03857

Determination of cell number 

Cell numbers were determined using a standard viable count agar based plating 
method, using the appropriate agar media. 



Preparation of fungal ACCase enzyme 

Cultures of the appropriate yeast strains were grown to the exponential phase of
growth (for Saccharomyces and Candida strains respectively). These were then harvested
by centrifugation (4400 g, 10min, 4°C) and washed twice in 700ml of 50mM Tris pH7.5
containing 20% w/v glycerol, resuspending the cell pellet each time. The final washed pellet
was fully resuspended into a thick slurry using 10 to 20ml of buffer (50mM Tris pH7.5
containing 1mM EGTA, 1mM EDTA (disodium salt), 1mM DTT, 0.25mM Pefabloc
hydrochloride, 1µM Leupeptin hemisulphate, 1µM Pepstatin A, 0.5µM Trypsin inhibitor
and 20% w/v glycerol). The volume of buffer required was dependent on the total packed
cell wet weight (i.e. 1ml buffer added per 6g of packed wet cell pellet).

The cell paste was homogenised using a pre-cooled Bead-Beater (Biospec
Products, Bartlesville, OK 74005) with 4 x 10 second bursts, allowing 20 second intervals on
ice. The preparation was then centrifuged at 31,180g for 30 minutes. After centrifugation
the supernatant was immediately decanted into a container, then aliquoted before snap
freezing in liquid nitrogen. The preparation was then stored at -80°C and was found to be
stable for at least 2 months.

All enzyme preparation steps were carried out at +4°C, unless otherwise stated.

In-vitro ACCase enzyme assay

The assay was conducted in 96 well, flat bottomed polystyrene microtitre plates. All
test and control samples were tested in duplicate in this assay.

100µl of the ACCase enzyme preparation (in 50mM Tris pH7.5 containing 1mM
EGTA, 1mM EDTA (disodium salt), 1mM DTT, 0.25mM Pefabloc hydrochloride, 1µM
Leupeptin hemisulphate, 1µM Pepstatin A, 0.5µM Trypsin inhibitor, and 20% w/v glycerol)
was added to each well of the microtitre plate. Each well contained either a 3µl test sample
made up in DMSO or 3µl DMSO alone (NB. Final DMSO concentrations in the assay were
1.48% v/v). The microtitre plates were placed in a water bath maintained at 37°C. 10µl of
[14C]NaHCO3 containing 9.25kBq in 378mM NaHCO3 was then added to each well. The
reaction was initiated by the addition of 100µl of Acetyl Coenzyme A containing assay
buffer (50mM Tris pH7.5 containing 4.41mM ATP (disodium salt), 2.1mM Acetyl
Coenzyme A, 2.52mM DTT, 10.5mM MgCl2, and 0.21% w/v Albumin [Bovine, fraction
V]), (removed from ice 5 minutes before use) to each well. The tubes were incubated at 37°C
for 5 minutes. The reaction was then terminated by the addition of 50µl of 6M HCl to each
well. In parallel, a pre-stopped assay control was set up which involved adding the 50µl of
6M HCl prior to the [14C]NaHCO3 and the assay buffer (no further HCl additions were made to
these wells after the 5 minute incubation). The DPM values for the pre-stopped assay were
subtracted from the normal assay situation.
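By way of non-limiting illustration, the final concentrations in the reaction may be estimated from the volumes given above. The sketch assumes a nominal reaction volume of 210µl (enzyme preparation, bicarbonate and assay buffer, ignoring the 3µl sample); this is an inference from the round values it yields and is not stated in the text. Likewise, the quoted 1.48% v/v DMSO is reproduced only if the 3µl is related to 203µl (enzyme, sample and assay buffer), which is again an interpretation. Only the assay buffer components are considered; for example, DTT contributed by the enzyme preparation is ignored.

```python
ENZYME_UL = 100.0
SAMPLE_UL = 3.0
BICARBONATE_UL = 10.0
ASSAY_BUFFER_UL = 100.0

# Assumption: nominal reaction volume excludes the 3 µl DMSO sample.
NOMINAL_VOLUME_UL = ENZYME_UL + BICARBONATE_UL + ASSAY_BUFFER_UL


def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as the stock."""
    return stock * added_ul / total_ul


assay_buffer_stock = {
    "ATP (mM)": 4.41,
    "acetyl-CoA (mM)": 2.1,
    "DTT (mM)": 2.52,
    "MgCl2 (mM)": 10.5,
    "albumin (% w/v)": 0.21,
}

final = {name: final_concentration(conc, ASSAY_BUFFER_UL, NOMINAL_VOLUME_UL)
         for name, conc in assay_buffer_stock.items()}
final["NaHCO3 (mM)"] = final_concentration(378.0, BICARBONATE_UL, NOMINAL_VOLUME_UL)

# Assumption: DMSO percentage relative to enzyme + sample + assay buffer.
dmso_percent = 100.0 * SAMPLE_UL / (ENZYME_UL + SAMPLE_UL + ASSAY_BUFFER_UL)
```

Under these assumptions the assay buffer components come out at 2.1mM ATP, 1.0mM acetyl-CoA, 1.2mM DTT, 5.0mM MgCl2 and 0.1% w/v albumin, with 18mM bicarbonate.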

After the addition of the stop reagent the plates were left open in the water bath for a
further 30 minutes to allow the 14CO2 to escape. After this time 150µl of each reaction
mixture were applied onto individual GF/C glass microfibre filter discs and allowed to dry
thoroughly before adding scintillation fluid. Radioactivity in the samples was then
determined by scintillation counting (Wallac WinSpectral 1414, Turku, Finland).
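By way of non-limiting illustration, the counted DPM values may be processed as sketched below: duplicates are averaged, the pre-stopped control is subtracted, and the net value is expressed either as a percentage of the uninhibited control or as bicarbonate fixed. The conversion assumes that the only bicarbonate present is the 10µl of 378mM solution carrying 9.25kBq, that 150µl of a 263µl stopped mixture is counted, and that DPM values are already corrected for counting efficiency. The function names and example counts are hypothetical.

```python
LABEL_DPM = 9.25e3 * 60.0          # 9.25 kBq expressed as disintegrations per minute
BICARBONATE_NMOL = 378.0 * 10.0    # 378 mM x 10 µl = 3780 nmol per well
DPM_PER_NMOL = LABEL_DPM / BICARBONATE_NMOL

STOPPED_VOLUME_UL = 100.0 + 3.0 + 10.0 + 100.0 + 50.0   # 263 µl after adding HCl
COUNTED_VOLUME_UL = 150.0


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


def net_dpm(sample_duplicates: list[float], prestopped_duplicates: list[float]) -> float:
    """Mean sample DPM minus mean pre-stopped control DPM."""
    return mean(sample_duplicates) - mean(prestopped_duplicates)


def nmol_fixed_per_well(net: float) -> float:
    """Scale the counted aliquot up to the whole well and convert to nmol."""
    return net / DPM_PER_NMOL * (STOPPED_VOLUME_UL / COUNTED_VOLUME_UL)


def percent_of_control(test: list[float], control: list[float],
                       prestopped: list[float]) -> float:
    """Activity in the presence of test compound as a percentage of DMSO control."""
    return 100.0 * net_dpm(test, prestopped) / net_dpm(control, prestopped)


# Hypothetical duplicate counts, for illustration only.
example_percent = percent_of_control([600.0, 620.0], [1100.0, 1120.0], [100.0, 120.0])
```

With the illustrative counts the test wells retain half of the control activity.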

IC50 values were calculated from the data using non-linear regression techniques
available in the ORIGIN software package (Microcal Software Inc., Massachusetts, USA).
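By way of non-limiting illustration, an IC50 may be estimated from dose response data by least-squares fitting of a two-parameter Hill curve. The sketch below uses a coarse grid search in place of the regression routines of the ORIGIN package, whose exact model is not specified here; the model, grid limits and synthetic data are all assumptions made for illustration. The concentration series spans 0.1nM to 100µM in ten-fold steps.

```python
import math


def hill(conc: float, ic50: float, slope: float) -> float:
    """Fraction of uninhibited activity remaining at inhibitor concentration conc."""
    return 1.0 / (1.0 + (conc / ic50) ** slope)


def fit_ic50(concs: list[float], activities: list[float]) -> tuple[float, float]:
    """Grid search minimising the sum of squared residuals; returns (ic50, slope)."""
    best = None
    for i in range(121):                      # log10(IC50) from -10 to -4, step 0.05
        ic50 = 10.0 ** (-10.0 + 0.05 * i)
        for j in range(26):                   # slope from 0.5 to 3.0, step 0.1
            slope = 0.5 + 0.1 * j
            sse = sum((a - hill(c, ic50, slope)) ** 2
                      for c, a in zip(concs, activities))
            if best is None or sse < best[0]:
                best = (sse, ic50, slope)
    return best[1], best[2]


# Molar concentrations from 0.1 nM (1e-10 M) to 100 µM (1e-4 M).
concs = [10.0 ** e for e in range(-10, -3)]

# Synthetic, noise-free data generated from a known curve (illustration only).
true_ic50, true_slope = 10.0 ** -7.5, 1.0
activities = [hill(c, true_ic50, true_slope) for c in concs]

fitted_ic50, fitted_slope = fit_ic50(concs, activities)
```

A grid search is slow compared with a proper non-linear regression but has no convergence issues and needs no external library; real data would also call for top and bottom plateau parameters.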

Soraphen A, which is a specific inhibitor of ACCase, was supplied over the range
0.1nM-100µM in the dose response regimen of the assay.

Protein determination

The total protein concentration of each ACCase preparation used was determined by
the Coomassie Blue method (Pierce, Illinois, USA), using 1cm path length cuvettes read
at 595nm (Philips PU8630 UV/VIS/NIR Spectrophotometer).


In-vitro antifungal activity 

Compounds were tested over a concentration range of 1024 - 0.00098 µg/ml by a
broth-dilution method in microtitre plates using doubling dilutions in YPD or YPGal (both
containing 500µg/l d-biotin). Stock solutions of inhibitors were prepared at 51.2mg/ml in
dimethyl sulphoxide (DMSO) (final assay concentration of DMSO was 2% v/v). Each
yeast culture was added to the well to give a final 10^4 cfu/well. The plates were incubated at
30°C for 48h and MICs determined visually.
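By way of non-limiting illustration, the concentration range quoted above is consistent with a series of doubling dilutions starting from the stock diluted to 2% v/v. The function name is hypothetical; the figures are those of the text.

```python
def doubling_dilutions(top_concentration: float, steps: int) -> list[float]:
    """Concentrations of a two-fold dilution series, highest first."""
    return [top_concentration / (2 ** i) for i in range(steps)]


# 51.2 mg/ml stock in DMSO at a final 2% v/v, expressed in µg/ml.
top_ug_per_ml = 51.2 * 1000.0 * 0.02

# 21 wells take the series from 1024 µg/ml down to 1024 / 2^20 µg/ml.
series = doubling_dilutions(1024.0, 21)
lowest = series[-1]
```

The lowest concentration, 1024/2^20 µg/ml, rounds to the 0.00098 µg/ml quoted, so the stated range corresponds to 21 concentrations.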

Discussion 

Expression of ACCase, a biotinylated protein, was monitored by a "biotin-avidin
affinity western blot" as described by Haßlacher et al., 1993. Expression of the C. albicans
ACC1 gene from its own promoter from plasmid YEp24 was comparable to that of the S.
cerevisiae gene (no overexpression). Expression under control of the GAL1 promoter,
however, was considerably higher, indicating a drastically increased level of biotinylated and
therefore fully functional enzyme. Transcription of the gene was fully induced as the cells
had to be grown on galactose to be viable (on glucose the GAL1 promoter is completely off,
causing the cells to arrest and eventually die due to insufficient supply of ACCase). The S.
cerevisiae strain described in this application is a convenient source of the C. albicans
enzyme. The engineered strain possesses no residual background ACCase because the gene
coding for the S. cerevisiae enzyme has been removed. Congenic versions of such a strain
(genetically identical apart from the ACCase gene carried) expressing different ACCases
(e.g. the different human (Abu-Elheiga et al., 1995), mammalian (Lopez-Casillas et al., 1988;
Takai et al., 1988; Barber et al., 1995), plant (Schulte et al., 1994) or other fungal enzymes
(Al-Feel et al., 1992; Saito et al., 1996; Bailey et al., 1995)) can be used as tools for
screening. Differences in growth of such strains may be solely dependent on differences in
their ACCase activity. Differential growth in the presence of ACCase inhibitors (for
example Soraphen A or compounds yet to be identified) indicates selectivity of the drug
towards one type of the ACCase enzyme.

References:

Abu-Elheiga L., Jayakumar A., Baldini A., Chirala S.S., Wakil S.J.; Proc. Natl. Acad. Sci. U.S.A. 92:4011-4015 (1995).

Al-Feel W., Chirala S.S., Wakil S.J.; Proc. Natl. Acad. Sci. U.S.A. 89:4534-4538 (1992).

Bailey A.M., Keon J.P.R., Owen J., Hargreaves J.A.; Mol. Gen. Genet. 249:191-201 (1995).

Barber M.C., Travers M.T.; Gene 154:271-275 (1995).

Haßlacher M., Ivessa A.S., Paltauf F., Kohlwein S.D.; J. Biol. Chem. 268:10946-10952 (1993).

Ito H., Fukuda Y., Murata K., Kimura A.; J. Bacteriol. 153:163-168 (1983).

Kunz J., Henriquez R., Schneider U., Deuter-Reinhard M., Movva N.R., Hall M.N.; Cell 73:585-596 (1993).

Lopez-Casillas F., Bai D.-H., Luo X., Kong I.-S., Hermodson M.A., Kim K.-H.; Proc. Natl. Acad. Sci. U.S.A. 85:5784-5788 (1988).

Maniatis T., Fritsch E.F., Sambrook J.; Molecular Cloning, Cold Spring Harbor Laboratory Press (1989).

Saiki R.K., Gelfand D.H., Stoffel S., Scharf S.J., Higuchi R., Horn G.T., Mullis K.B., Erlich H.A.; Science 239:487-494 (1988).

Saito A., Kazuta Y., Toh H., Kondo H., Tanabe T.; S. pombe ACC1, Submitted (Dec-1996) to EMBL/GenBank/DDBJ Data Banks.

Schulte W., Schell J., Toepfer R.; Plant Physiol. 106:793-794 (1994).

Sherlock G., Bahman A.M., Mahal A., Shieh J.C., Ferreira M., Rosamond J.; Mol. Gen. Genet. 245:716-723 (1994).

Takai T., Yokoyama C., Wada K., Tanabe T.; J. Biol. Chem. 263:2651-2657 (1988).

Wach A., Brachat A., Poehlmann R., Philippsen P.; Yeast 10:1793-1808 (1994).



Table 1: Comparative properties of native and recombinant acetyl-CoA carboxylase enzymes (specific activity of the ACCase preparations; table body not recoverable from the source).

Claims:

1. A polynucleotide encoding an Acetyl-CoA-carboxylase (ACCase) gene from Candida
albicans.

2. A polynucleotide as claimed in claim 1 and as set out in Figure 4 herein.

3. A polynucleotide as claimed in claim 2 and characterised by the start codon atg2.

4. A polynucleotide comprising a restriction fragment of a polynucleotide as claimed in
any one of claims 1-3.

5. A polynucleotide probe comprising a polynucleotide as claimed in any one of claims
1-4.

6. An Acetyl-CoA-carboxylase (ACCase) polypeptide from Candida albicans in isolated
and purified form.

7. A polypeptide as claimed in claim 6 and as set out in Figure 5.

8. A polypeptide as claimed in claim 7 and characterised by Met2.

9. A polypeptide as claimed in claim 6 and obtained by expression of a polynucleotide as
claimed in any one of claims 1-4.

10. Antibodies specific for a polypeptide as claimed in any one of claims 6-9.

11. An antisense polynucleotide specific for all or a part of a polynucleotide as claimed in
any one of claims 1-4.

12. An RNA transcript corresponding to a polynucleotide as claimed in any one of claims
1-4.

13. An expression system for expression of an Acetyl-CoA-carboxylase (ACCase)
polypeptide from Candida albicans which system comprises an S. cerevisiae host strain
having a Candida albicans ACC1 polynucleotide as claimed in any one of claims 1-3,
inserted in place of the native ACC1 gene from S. cerevisiae, whereby the Candida albicans
ACC1 polypeptide is expressed.

14. An expression system as claimed in claim 13 and adapted for controlled
overexpression of the Candida albicans polynucleotide relative to expression under the
control of a Candida albicans promoter.

15. An expression system as claimed in claim 14 and used to provide an Acetyl-CoA-carboxylase
(ACCase) gene from Candida albicans in sufficient quantity and with sufficient
activity for compound screening purposes.

16. Use of an Acetyl-CoA-carboxylase (ACCase) polypeptide from Candida albicans as
claimed in claim 6, in an assay to identify inhibitors of the polypeptide.

17. Use as claimed in claim 16 in pharmaceutical research.

FIGURE 1



GCACGCTTGACGGTTTTCACCAAATGCGAAAATATGACCAAATTGAGAATCCGAAAATGA 
ATGGATAGAAGATTGGTTACCAACTGAGAAATAACCCCACACATTAGAAGAAGAACGGAA 
ATTCAATTCATGTAAAGAACCACCACTTGGTTTAAAACCTTCACCAGGATCTTCAGAAGT 
AATACGACAAGCAGTACAATGTCCCTTTGGTGTTGGTCTTCTTTGACTAACCAATGAAGT 
TTCTGACTTGAATTCAAAATCAATATCAGTAGTGGTATGAGGATCGGCACCGCACAAAGT 
TCTGATATCTCTGATTCTATGCATTGGTATACCCATAGCAATTTGTAATTGAGCAGCTGG 
TAAATTAACACCTGTCACCATTTCAGTGGTTGGATGTTCAACTTGCAATCTTGGGTTCAA 
TTCCAAAAAGTAGAATTTATCTTCAGCGTGGGGAGTAAAGGTACTCAACAGTACCAGGGG 
GTTACATAACCAACTTATTTTACCCAATCTGACTGGTGGATTT 



WO 99/32635 PCT/GB98/03857 

FIGURE 2 

[Drawing not recoverable; surviving labels: CLS2-d1, CLS2-13, CLS1-b1, ACC1] 

[Drawing not recoverable; surviving label: HindIII] 

FIGURE 4 



AATATATTGCTTCCTTTTGATAGGAAGTAACTCCGAGTGTTTGAATTTGATATATGTTATTCATATACGTTCAATGGCTC 
TCTTCTATGCTTTGTATATACTTTCTTTTGAATAGATACTCATGTAAAGAGATTTGAAACCATATTCTAACCAACAAAAA 
TATTGTACGGTATAGGTTAGAAAAAAAACTCCGTAAGGTCCGCTTACACGGTTAAATTGAAAACACGTTAAAAATATATT 
TGGGTAATGGACTAAGCTATATACAGTACTCAACAAAAATGAAATCAAACACAATGTTCTTTGGGAAATTCATTTCATGC 
AACTAGGGTGATTCTCTTTCTACTATCCAACAACGATAACCCTGCTTTTGAAAAATCTTTTCTAAATTCAAATTGATATA 
ATTCTTATTTATATATTACTTTCTTTTTCCCATATAACCCCATTTTTTTTTTGGAATCATATTTGTTTTTGATTTTTGCT 
TTCCCTTTCAGTCTGAGGAACATACTAATTACGAACAACAATTATACATCCAATCTTCATCTAACGAATTGATTATTTAC 
ATTTATTAAACCCTTGGATACAAACTGATTACACTTTTTAGTTAGTTTGTTCAATTATAAGGGTATTATACAACAAAGAT 
ATCATTTAAAGTTAAATCTCAATCTGGAATAATAAAAGTATTCAACACTTTTGCTTACAATAGGTATGTTCAAAATCAAT 
TGAAGCCATCGAGATAAGAAATTAAGCAAAAACGTTTACAATTGTTGTGTGTGTGTTGCAGTGTTTGAAGAAGCTCGAGT 
GATTGCTTTTCTTCGGCATCAGCTGTGTTGGGAACATCTTGTCGTTAAAGTTTCGGAGTAATATTAGAGTAATGGAACGA 
AAAAAACAAAATAAAGTTCTGGAACCACAAAGATTTGAAAAATTGGGTAGAAACAAAAAAAAGACAAAGCAGGAACCCAA 
CAATAAATGAATAAACACTCAAAAACTACTCACAACAACAACACTTATTTTCACTTGCTTTATTTCTTCGATTTTTTatg 
AGATGCAAATTATCTCTAATAAAGAATACTAACTCACTTGTACATAGATCGCGTTTCCTAATTACAAAACCACAACTATA 
TATACCTCATCGTCATTATATCCCATTCAAGAACATATTCAAGTCATTGTTAatgTCAGATCAATCTCCATCTCCTAGTC 
CTAGCGATTCCCTTAGCTACACTACATTACATGAAAATTTGCCATCTCATTTCTTGGGTGGAAATTCAGTTTTGAATGCT 
GAACCTTCTAAAGTCAGAGACTTTGTCAGAGCTCATCAAGGTCATACAGTTATTTCGAAAATTTTAATTGCCAACAATGG 
TATAGCTGCAGTTAAAGAAATCAGATCAGTTAGAAAATGGGCTTATGAAACATTTGGTGACGAAAAAGCCATACAGTTTA 
CCGTTATGGCCACTCCAGAAGATTTGGAAGCTAATGCCGAATATATTAGAATGGCCGACCAATTCATTGAAGTCCCTGGT 
GGCACCAATAACAATAACTATGCTAATGTTGATCTCATTGTAGAGATAGCAGAAAGTACAAATGCTCATGCCGTTTGGGC 
TGGGTGGGGGCATGCTTCAGAGAATCCTTTGTTACCAGAAAAATTAGCTGCATCTCCCAAAAAAATTATTTTTATTGGTC 
CTCCTGGTTCAGCTATGAGATCTTTAGGTGACAAGATTTCATCTACTATAGTTGCTCAACATGCTCAAGTACCATGTATT 
CCATGGTCCGGTACTGGTGTTGATGAAGTGAAAATAGACCCACAAACTAATTTGGTTTCTGTTGCTGATGATATTTATGC 



CAAAGGGTGCTGTACTAGTCCAGAAGATGGTTTAGAAAAAGCCAAAAAAATTGGGTTCCCAGTTATGATTAAAGCCTCTG 
AAGGTGGTGGTGGTAAAGGTATTAGAAAAGTTGATGATGAGAAAAACTTCATTACCTTATACAACCAAGCAGCTAATGAA 
ATACCAGGTTCTCCTATCTTTATTATGAAGTTAGCAGGTGATGCCAGACATTTAGAAGTTCAATTACTAGCAGATCAATA 
CGGTACTAACATTTCCCTTTTTGGAAGAGATTGTTCCGTACAAAGAAGACACCAAAAGATTATTGAAGAAGCACCAGTCA 
CCATTGCCAGAAAGGAAACTTTCCACGAAATGGAAAATGCAGCAGTCAGATTGGGTAAATTAGTTGGTTATGTATCCGCT 
GGTACTGTTGAGTATCTTTACTCCCACGCTGAAGATAAATTCTACTTTTTGGAATTGAACCCAAGATTGCAAGTTGAACA 
TCCAACCACTGAAATGGTGACAGGTGTTAATTTACCAGCTGCTCAATTACAAATTGCTATGGGTATACCAATGCATAGAA 
TCAGAGATATCAGAACTTTGTACGGTGCCGATCCTCATACCACTACTGATATTGATTTTGAATTCAAGTCAGAAACTTCA 
TTGGTTAGTCAAAGAAGACCAACACCAAAGGGACATTGTACTGCTTGTCGTATTACTTCTGAAGATCCTGGTGAAGGTTT 
TAAACCAAGTGGTGGTTCTTTACATGAATTGAATTTCCGTTCTTCTTCTAATGTGTGGGGTTATTTCTCAGTTGGTAACC 
AATCTTCTATCCATTCATTTTCGGATTCTCAATTTGGTCATATTTTCGCATTTGGTGAAAACCGTCAAGCTTCAAGAAAA 
CATATGGTTGTTGCCTTGAAAGAATTGAGTATTAGAGGTGATTTTAGAACTACTGTTGAGTATTTAATCAAATTGTTAGA 
AACTCCAGATTTCGAGGATAATACCATTACAACTGGTTGGTTGGATGAATTAATCACCAAAAAGTTGACTGCTGAAAGAC 

CAGATCCAATAGTTGCTGTTGTTTGTGGAGCTGTAACCAAAGCACACATCCAGGCTGAGGAAGAGAAAAAGGAATACATC 
CAATCTTTGGAAAAAGGTCAAGTTCCTCACAGAAACTTATTGAAAACTATTTTCCCAGTTGAGTTTATTTATGAAGGTGA 
AAGATACAAGTTCACTGCTACTAAATCTTCAGAAGATAAATATACTTTGTTCCTTAATGGTTCTCGTTGTGTTGTTGGTG 
CACGTTCATTGTCCGATGGTGGTTTATTGTGTGCATTAGATGGGAAATCACATTCTGTCTATTGGAAGGAAGAGGCATCT 
GCCACTAGATTATCAGTTGATGGCAAAACTTGTTTATTAGAAGTTGAAAATGATCCAACACAATTAAGAACTCCATCTCC 
AGGTAAATTGGTCAAGTATTTGGTTGACAGTGGTGAACATGTTGATGCTGGTCAACCATACGCTGAAGTCGAAGTTATGA 
AAATGTGTATGCCTTTGATTGCTCAAGAAAATGGGGTAGTGCAGTTGATTAAACAACCGGGTTCCACAGTTAATGCTGGT 

GATATCTTGGCCATTTTGGCATTGGACGATCCATCTAAGGTCAAACATGCTAAACCATTTGAAGGTACTTTACCATCTAT 

GGGTGAGCCAAATGTTACAGGTACTAAACCAGCACATAAATTCAATCATTGTGCTGGTATTTTGAAAAACATTTTGGCTG 

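GTTATGATAATCAAGTGATTTTGAATTCTACTTTAAAGAGTCTTGGTGAAGTTTTGAAAGACAATGAATTGCCATACTCT 
GAATGGCAACAACAAATTTCAGCTTTACACTCCAGATTGCCACCTAAATTGGATGACGGATTGACTGCATTGGTTGAAAG 
AACTCAAAGTAGAGGTGCTGAATTCCCTGCTCGTCAAATTTTAAAACTCATCACCAAATCAATTGCTGAAAATGGTAATG 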

ATATGTTAGAAGATGTTGTTGCACCATTGGTTTCTATTGCCACAAGTTACCAGAATGGTTTGGTTGAACACGAATACGAT 
TACTTTGCATCTTTGATTAACGAATATTATGACGTTGAAAGTTTGTTTTCAGGTGAAAATGTTAGAGAAGATAATGTTAT 
CTTGAAATTAAGAGATGAAAACAAATCTGATTTGAAAAAAGTTATTGGTATTGGTTTGTCTCATTCACGTGTTAGTGCCA 
AGAACAATTTGATTTTAGCTATTTTGGACATTTATGAACCATTGTTGCAATCCAACTCGTCAGTTGCTGCCTCTATCAGA 
GAAGCTTTAAAGAACTTGTTCATTAGACCTCGTGCTTGTGCCAAAGTTGCATTAAAGGCAAGAGAAATTTTAATTCAATG 

TTCTTTACCTTCCATCAAGGAAAGATCCGATCAATTGGAACATATTTTGAGGTCATCTGTTGTTCAAACCTCTTATGGTG 
AAATTTTTGCTAAACATAGAGAACCAAATTTGGAAATTATTCGTGAGGTTGTTGATTCCAAACATATTGTTTTTGATGTG 
TTGGCACAATTCTTAATCAATCCAGACCCATGGGTTGCCATTGCTGCCGCTGAAGTTTATGTCAGACGTTCATACCGTGC 
TTATGATTTGGGTAAAATTGAATATCATGTTAATGACAGACTTCCTATTGTTGAATGGAAATTCAAGTTGGCTAATATGG 
GAGCCGCTGGTGTAAACGATGCTCAACAGGCTGCTGCTGCCGGTGGCGATGATTCGACATCTATGAAACATGCAGCTTCT 

GTGTCTGATTTGACCTTTGTTGTTGATTCTAAAACCGAGCATTCCACAAGAACTGGTGTTTTAGCTCCAGCAAGACACTT 

GGATGATGTTGATGAAACTCTTACAGCTGCATTGGAACAATTCCAACCAGCCGATGCTATTTCATTTAAAGCAAAGGGTG 
AAACTCCAGAGTTATTAAATGTTTTGAATATTGTCATTACCAGTATTGATGGTTACTCCGATGAAAATGAATACTTGAGC 

AGAATTAATGAAATCTTGTGCGAATACAAAGAAGAGTTGATTTCTGCTGGTGTTCGTCGTGTTACATTTGTTTTTGCTCA 
TCAAATTGGTCAATATCCTAAATATTATACTTTTACTGGTCCTGACTATGAAGAAAACAAGGTTATTAGACACATTGAAC 
CAGCTTTGGCTTTCCAATTGGAATTGGGAAGATTAGCCAATTTCGATATCAAACCAATTTTCACTAACAACAGAAACATC 
CATGTATATGATGCAATTGGGAAGAATGCTCCTTCTGATAAAAGATTTTTCACCAGAGGGATTATTAGAACCGGTGTTCT 
TAAAGAAGACATTAGCATTAGTGAATATTTGATTGCTGAATCCAACAGATTAATGAATGATATTTTGGATACTTTAGAAG 
TTATTGACACTTCTAATTCTGATTTAAACCATATTTTCATTAACTTTTCCAATGCTTTCAATGTTCAAGCTTCAGATGTT 
GAGGCTGCCTTTGGATCATTCTTAGAAAGATTTGGTAGAAGATTATGGAGATTAAGAGTTACTGGTGCTGAAATTAGAAT 
TGTCTGTACTGATCCTCAAGGTACTTCGTTCCCATTGCGTGCTATCATTAATAATGTTTCTGGTTATGTTGTCAAATCAG 
AATTGTATTTGGAAGTGAAAAATCCTAAAGGTGAATGGGTTTTCAAATCCATTGGTCATCCTGGTTCCATGCATTTGAGA 
CCTATCTCAACTCCATATCCAGTTAAAGAATCTTTACAACCAAAACGTTACAAGGCTCACAATATGGGTACCACTTATGT 
GTATGACTTCCCAGAATTGTTTCGTCAAGCAACAATTTCACAATGGAAAAAATATGGCAAAAAAGTACCAAAAGATGTTT 
TCGTGTCTTTAGAATTGATCACTGATGAAACTGATTCCTTAATAGCTGTTGAAAGAGATCCGGGTGCTAACAAAATTGGA 
ATGGTTGGATTCAAAGTCACTGCTAAAACTCCTGAATACCCTCATGGTCGTCAATTAATTATTGTTGCCAATGATATCAC 
CCACAAGATTGGTTCTTTTGGTCCAGAAGAAGATAATTATTTCAACAAGTGTACTGAATTGGCCAGAAAATTAGGTATTC 
CAAGAATTTACCTTTCTGCAAATTCAGGTGCTAGAATTGGTGTTGCTGAGGAATTGATTCCATTATACCAAGTTGCCTGG 
AATGAAGAAGGGTCTCCTGACAAAGGATTCAGATACTTGTACTTGAGTACTGCTGCTAAAGAGTCTTTAGAAAAAGATGG 
TAAAAGTGACAGTGTTGTTACTGAACGTATTGTTGAAAAAGGTGAAGAGCGTCATGTCATTAAAGCTATTATTGGTGCCG 
AAGATGGCTTAGGGGTTGAATGTCTTAAAGGATCAGGTTTAATTGCTGGTGCCACATCAAGAGCTTACAAGGATATATTT 
ACCATCACTTTGGTAACTTGTAGATCTGTTGGTATTGGTGCTTATTTGGTTAGATTGGGTCAAAGAGCCATTCAAATCGA 
TGGTCAACCTATTATTTTAACTGGTGCTCCTGCTATCAATAAATTGTTGGGTAGAGAAGTGTATTCTTCCAATCTTCAAT 
TGGGTGGTACTCAAATCATGTACAATAATGGTGTTTCTCATTTGACAGCTAATGATGATTTGGCTGGGGTTGAAAAAATT 
ATGGAATGGTTATCATATGTTCCAGCTAAACGTGGTTTACCAGTGCCAATTTTGGAATCAGAAGATTCTTGGGACAGAGA 
TGTTGATTACTACCCACCAAAACAAGAAGCTTTTGATGTTAGATGGATGATCCAAGGTAGAGAAGTTGATGGTGAATATG 
AATCTGGGTTATTTGATAAAGATTCATTCCAAGAAACATTATCTGGTTGGGCTAAAGGTGTTGTTGTTGGTAGAGCACGT 
TTGGGTGGTATTCCAATTGGTGTTATTGGTGTCGAAACCAGAACAGTGGAAAACTTGATTCCTGCTGATCCAGCAAATCC 
AGACTCTACAGAAAGTTTGATTCAAGAAGCAGGTCAAGTGTGGTATCCTAACTCTGCTTTTAAGACAGCACAAGCTATAA 
ATGATTTCAACAATGGTGAACAATTGCCATTAATGATTTTAGCAAATTGGAGAGGTTTCTCTGGTGGTCAAAGAGATATG 
TACAATGAAGTCTTGAAATATGGTTCATTTATTGTTGATGCTTTAGTTGACTTCAAGCAACCTATCTTCACTTACATTCC 
ACCAAATGGAGAATTGAGAGGTGGCTCTTGGGTTGTTGTTGATCCAACCATCAACTCAGATATGATGGAAATGTATGCCG 
ATGTCGATTCGAGAGCTGGTGTTTTGGAACCAGAAGGTATGGTTGGTATCAAATACAGACGTGATAAATTATTAGCAACT 
ATGGAAAGATTAGATCCAACTTATGGTGAAATGAAAGCTAAGTTAAATGACTCGTCATTATCTCCAGAAGAACACTCGAA 
AATAAGCGCCAAATTGTTTGCACGTGAAAAGGCTTTATTACCAATTTATGCTCAAATTTCCGTTCAATTTGCTGACTTGC 
ACGATAGATCAGGTCGTATGTTGGCCAAGGGAGTTATTAGAAAGGAAATCAAATGGACTGATGCTAGACGTTTCTTCTTC 
TGGAGATTGAGAAGAAGATTGAACGAGGAATATGTTTTGAGATTGATTAGTGAACAAATTAAAGATTCTAGCAAATTGGA 
AAGAGTTGCCAGATTGAAGAGTTGGATGCCAACTGTTGAATACGATGATGACCAAGCTGTCAGTAACTGGATTGAAGAGA 
ACCATGCCAAATTGCAAAAGAGAGTTAATGAATTGAAACAAGAAGTTTCAAGAACCAAGATTATGAGATTATTAAAAGAG 
GATCCAAATAGTGCAATTTCTGCAATGAAAGACTATGTTGAAAGATTGTCAAAAGAAGATAAAGAGAAATTCCTCAAGGC 
ATTGAAGtagAAGTGGTTTCCATTAATTCAACTTTTTAATGACATTGAAAGTAGTAGTAGTTGTTGTTTTTTAGATTTAA 
GTATATTATATTATGTAATAAATTATAGAAAGTAATTATAGTTTTGACGGTTAATTGACGAGAGTGGGAAATTGGCTTTT 
TTGTTGCTCGTGTGATGAAACAGTGATTGACACAAAAAAATAGACAATGAAAAC 
FIGURE 5 

MRCKLSLIKNTNSLVHRSRFLITKPQLYIPHRHYIPFKNIFKSLLMSDQSPSPSPSDSLSYTTLHENLPSHFLGGNSVLN 
AEPSKVRDFVRAHQGHTVISKILIANNGIAAVKEIRSVRKWAYETFGDEKAIQFTVMATPEDLEANAEYIRMADQFIEVP 
GGTNNNNYANVDLIVEIAESTNAHAVWAGWGHASENPLLPEKLAASPKKIIFIGPPGSAMRSLGDKISSTIVAQHAQVPC 
IPWSGTGVDEVKIDPQTNLVSVADDIYAKGCCTSPEDGLEKAKKIGFPVMIKASEGGGGKGIRKVDDEKNFITLYNQAAN 
EIPGSPIFIMKLAGDARHLEVQLLADQYGTNISLFGRDCSVQRRHQKIIEEAPVTIARKETFHEMENAAVRLGKLVGYVS 
AGTVEYLYSHAEDKFYFLELNPRLQVEHPTTEMVTGVNLPAAQLQIAMGIPMHRIRDIRTLYGADPHTTTDIDFEFKSET 

SLVSQRRPTPKGHCTACRITSEDPGEGFKPSGGSLHELNFRSSSNVWGYFSVGNQSSIHSFSDSQFGHIFAFGENRQASR 
KHMVVALKELSIRGDFRTTVEYLIKLLETPDFEDNTITTGWLDELITKKLTAERPDPIVAVVCGAVTKAHIQAEEEKKEY 
IQSLEKGQVPHRNLLKTIFPVEFIYEGERYKFTATKSSEDKYTLFLNGSRCVVGARSLSDGGLLCALDGKSHSVYWKEEA 
SATRLSVDGKTCLLEVENDPTQLRTPSPGKLVKYLVDSGEHVDAGQPYAEVEVMKMCMPLIAQENGVVQLIKQPGSTVNA 
GDILAILALDDPSKVKHAKPFEGTLPSMGEPNVTGTKPAHKFNHCAGILKNILAGYDNQVILNSTLKSLGEVLKDNELPY 
SEWQQQISALHSRLPPKLDDGLTALVERTQSRGAEFPARQILKLITKSIAENGNDMLEDVVAPLVSIATSYQNGLVEHEY 
DYFASLINEYYDVESLFSGENVREDNVILKLRDENKSDLKKVIGIGLSHSRVSAKNNLILAILDIYEPLLQSNSSVAASI 
REALKNLFIRPRACAKVALKAREILIQCSLPSIKERSDQLEHILRSSVVQTSYGEIFAKHREPNLEIIREVVDSKHIVFD 
VLAQFLINPDPWVAIAAAEVYVRRSYRAYDLGKIEYHVNDRLPIVEWKFKLANMGAAGVNDAQQAAAAGGDDSTSMKHAA 
SVSDLTFVVDSKTEHSTRTGVLAPARHLDDVDETLTAALEQFQPADAISFKAKGETPELLNVLNIVITSIDGYSDENEYL 

SRINEILCEYKEELISAGVRRVTFVFAHQIGQYPKYYTFTGPDYEENKVIRHIEPALAFQLELGRLANFDIKPIFTNNRN 
IHVYDAIGKNAPSDKRFFTRGIIRTGVLKEDISISEYLIAESNRLMNDILDTLEVIDTSNSDLNHIFINFSNAFNVQASD 
VEAAFGSFLERFGRRLWRLRVTGAEIRIVCTDPQGTSFPLRAIINNVSGYVVKSELYLEVKNPKGEWVFKSIGHPGSMHL 
RPISTPYPVKESLQPKRYKAHNMGTTYVYDFPELFRQATISQWKKYGKKVPKDVFVSLELITDETDSLIAVERDPGANKI 
GMVGFKVTAKTPEYPHGRQLIIVANDITHKIGSFGPEEDNYFNKCTELARKLGIPRIYLSANSGARIGVAEELIPLYQVA 
WNEEGSPDKGFRYLYLSTAAKESLEKDGKSDSVVTERIVEKGEERHVIKAIIGAEDGLGVECLKGSGLIAGATSRAYKDI 
FTITLVTCRSVGIGAYLVRLGQRAIQIDGQPIILTGAPAINKLLGREVYSSNLQLGGTQIMYNNGVSHLTANDDLAGVEK 
IMEWLSYVPAKRGLPVPILESEDSWDRDVDYYPPKQEAFDVRWMIQGREVDGEYESGLFDKDSFQETLSGWAKGVVVGRA 
RLGGIPIGVIGVETRTVENLIPADPANPDSTESLIQEAGQVWYPNSAFKTAQAINDFNNGEQLPLMILANWRGFSGGQRD 
MYNEVLKYGSFIVDALVDFKQPIFTYIPPNGELRGGSWVVVDPTINSDMMEMYADVDSRAGVLEPEGMVGIKYRRDKLLA 
TMERLDPTYGEMKAKLNDSSLSPEEHSKISAKLFAREKALLPIYAQISVQFADLHDRSGRMLAKGVIRKEIKWTDARRFF 

FWRLRRRLNEEYVLRLISEQIKDSSKLERVARLKSWMPTVEYDDDQAVSNWIEENHAKLQKRVNELKQEVSRTKIMRLLK 
EDPNSAISAMKDYVERLSKEDKEKFLKALK 
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: 
(A) NAME: Zeneca Ltd 
(B) STREET: 15 Stanhope Gate 
(C) CITY: London 
(D) STATE: Greater London 
(E) COUNTRY: England 
(F) POSTAL CODE (ZIP): W1Y 6LN 
(G) TELEPHONE: 0171 304 5000 
(H) TELEFAX: 0171 304 5151 
(I) TELEX: 0171 834 2042 

(ii) TITLE OF INVENTION: PROCESS 

(iii) NUMBER OF SEQUENCES: 3 

(iv) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(vi) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: GB 9726897.3 
(B) FILING DATE: 20-DEC-1997 
(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 523 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GCACGCTTGA CGGTTTTCAC CAAATGCGAA AATATGACCA AATTGAGAAT CCGAAAATGA 60 

ATGGATAGAA GATTGGTTAC CAACTGAGAA ATAACCCCAC ACATTAGAAG AAGAACGGAA 120 

ATTCAATTCA TGTAAAGAAC CACCACTTGG TTTAAAACCT TCACCAGGAT CTTCAGAAGT 180 

AATACGACAA GCAGTACAAT GTCCCTTTGG TGTTGGTCTT CTTTGACTAA CCAATGAAGT 240 

TTCTGACTTG AATTCAAAAT CAATATCAGT AGTGGTATGA GGATCGGCAC CGCACAAAGT 300 

TCTGATATCT CTGATTCTAT GCATTGGTAT ACCCATAGCA ATTTGTAATT GAGCAGCTGG 360 

TAAATTAACA CCTGTCACCA TTTCAGTGGT TGGATGTTCA ACTTGCAATC TTGGGTTCAA 420 

TTCCAAAAAG TAGAATTTAT CTTCAGCGTG GGGAGTAAAG GTACTCAACA GTACCAGGGG 480 

GTTACATAAC CAACTTATTT TACCCAATCT GACTGGTGGA TTT 523 
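
Each row of the listing ends with a running base count, which gives a simple way to catch transcription slips. The following is a minimal sketch, not part of the application: the function name `join_listing_rows` is hypothetical, and it assumes every row consists of whitespace-separated groups of A, C, G and T followed by the running count.

```python
import re


def join_listing_rows(rows):
    """Join sequence-listing rows, checking each row's trailing running count."""
    sequence = ""
    for row in rows:
        # The last whitespace-separated field is the running base count.
        *groups, count = row.split()
        sequence += "".join(groups)
        if not re.fullmatch(r"[ACGT]+", sequence):
            raise ValueError("non-ACGT character in row: " + row)
        if len(sequence) != int(count):
            raise ValueError(f"row ends at {len(sequence)}, listing says {count}")
    return sequence


# First two rows of SEQ ID NO: 1 as an example input.
rows = [
    "GCACGCTTGA CGGTTTTCAC CAAATGCGAA AATATGACCA AATTGAGAAT CCGAAAATGA 60",
    "ATGGATAGAA GATTGGTTAC CAACTGAGAA ATAACCCCAC ACATTAGAAG AAGAACGGAA 120",
]
print(len(join_listing_rows(rows)))
```

A row whose count disagrees with the accumulated length raises an error, which points to the first row where a base was dropped or duplicated.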

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8054 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AATATATTGC TTCCTTTTGA TAGGAAGTAA CTCCGAGTGT TTGAATTTGA TATATGTTAT 60 

TCATATACGT TCAATGGCTC TCTTCTATGC TTTGTATATA CTTTCTTTTG AATAGATACT 120 

CATGTAAAGA GATTTGAAAC CATATTCTAA CCAACAAAAA TATTGTACGG TATAGGTTAG 180 

AAAAAAAACT CCGTAAGGTC CGCTTACACG GTTAAATTGA AAACACGTTA AAAATATATT 240 

TGGGTAATGG ACTAAGCTAT ATACAGTACT CAACAAAAAT GAAATCAAAC ACAATGTTCT 300 

TTGGGAAATT CATTTCATGC AACTAGGGTG ATTCTCTTTC TACTATCCAA CAACGATAAC 360 

CCTGCTTTTG AAAAATCTTT TCTAAATTCA AATTGATATA ATTCTTATTT ATATATTACT 420 

TTCTTTTTCC CATATAACCC CATTTTTTTT TTGGAATCAT ATTTGTTTTT GATTTTTGCT 480 

TTCCCTTTCA GTCTGAGGAA CATACTAATT ACGAACAACA ATTATACATC CAATCTTCAT 540 

CTAACGAATT GATTATTTAC ATTTATTAAA CCCTTGGATA CAAACTGATT ACACTTTTTA 600 

GTTAGTTTGT TCAATTATAA GGGTATTATA CAACAAAGAT ATCATTTAAA GTTAAATCTC 660 

AATCTGGAAT AATAAAAGTA TTCAACACTT TTGCTTACAA TAGGTATGTT CAAAATCAAT 720 

TGAAGCCATC GAGATAAGAA ATTAAGCAAA AACGTTTACA ATTGTTGTGT GTGTGTTGCA 780 

GTGTTTGAAG AAGCTCGAGT GATTGCTTTT CTTCGGCATC AGCTGTGTTG GGAACATCTT 840 

GTCGTTAAAG TTTCGGAGTA ATATTAGAGT AATGGAACGA AAAAAACAAA ATAAAGTTCT 900 

GGAACCACAA AGATTTGAAA AATTGGGTAG AAACAAAAAA AAGACAAAGC AGGAACCCAA 960 

CAATAAATGA ATAAACACTC AAAAACTACT CACAACAACA ACACTTATTT TCACTTGCTT 1020 

TATTTCTTCG ATTTTTTATG AGATGCAAAT TATCTCTAAT AAAGAATACT AACTCACTTG 1080 

TACATAGATC GCGTTTCCTA ATTACAAAAC CACAACTATA TATACCTCAT CGTCATTATA 1140 

TCCCATTCAA GAACATATTC AAGTCATTGT TAATGTCAGA TCAATCTCCA TCTCCTAGTC 1200 

CTAGCGATTC CCTTAGCTAC ACTACATTAC ATGAAAATTT GCCATCTCAT TTCTTGGGTG 1260 

GAAATTCAGT TTTGAATGCT GAACCTTCTA AAGTCAGAGA CTTTGTCAGA GCTCATCAAG 1320 

GTCATACAGT TATTTCGAAA ATTTTAATTG CCAACAATGG TATAGCTGCA GTTAAAGAAA 1380 
TCAGATCAGT TAGAAAATGG GCTTATGAAA CATTTGGTGA CGAAAAAGCC ATACAGTTTA 1440 

CCGTTATGGC CACTCCAGAA GATTTGGAAG CTAATGCCGA ATATATTAGA ATGGCCGACC 1500 

AATTCATTGA AGTCCCTGGT GGCACCAATA ACAATAACTA TGCTAATGTT GATCTCATTG 1560 

TAGAGATAGC AGAAAGTACA AATGCTCATG CCGTTTGGGC TGGGTGGGGG CATGCTTCAG 1620 

AGAATCCTTT GTTACCAGAA AAATTAGCTG CATCTCCCAA AAAAATTATT TTTATTGGTC 1680 

CTCCTGGTTC AGCTATGAGA TCTTTAGGTG ACAAGATTTC ATCTACTATA GTTGCTCAAC 1740 

ATGCTCAAGT ACCATGTATT CCATGGTCCG GTACTGGTGT TGATGAAGTG AAAATAGACC 1800 

CACAAACTAA TTTGGTTTCT GTTGCTGATG ATATTTATGC CAAAGGGTGC TGTACTAGTC 1860 

CAGAAGATGG TTTAGAAAAA GCCAAAAAAA TTGGGTTCCC AGTTATGATT AAAGCCTCTG 1920 

AAGGTGGTGG TGGTAAAGGT ATTAGAAAAG TTGATGATGA GAAAAACTTC ATTACCTTAT 1980 

ACAACCAAGC AGCTAATGAA ATACCAGGTT CTCCTATCTT TATTATGAAG TTAGCAGGTG 2040 

ATGCCAGACA TTTAGAAGTT CAATTACTAG CAGATCAATA CGGTACTAAC ATTTCCCTTT 2100 

TTGGAAGAGA TTGTTCCGTA CAAAGAAGAC ACCAAAAGAT TATTGAAGAA GCACCAGTCA 2160 

CCATTGCCAG AAAGGAAACT TTCCACGAAA TGGAAAATGC AGCAGTCAGA TTGGGTAAAT 2220 

TAGTTGGTTA TGTATCCGCT GGTACTGTTG AGTATCTTTA CTCCCACGCT GAAGATAAAT 2280 

TCTACTTTTT GGAATTGAAC CCAAGATTGC AAGTTGAACA TCCAACCACT GAAATGGTGA 2340 

CAGGTGTTAA TTTACCAGCT GCTCAATTAC AAATTGCTAT GGGTATACCA ATGCATAGAA 2400 

TCAGAGATAT CAGAACTTTG TACGGTGCCG ATCCTCATAC CACTACTGAT ATTGATTTTG 2460 

AATTCAAGTC AGAAACTTCA TTGGTTAGTC AAAGAAGACC AACACCAAAG GGACATTGTA 2520 

CTGCTTGTCG TATTACTTCT GAAGATCCTG GTGAAGGTTT TAAACCAAGT GGTGGTTCTT 2580 

TACATGAATT GAATTTCCGT TCTTCTTCTA ATGTGTGGGG TTATTTCTCA GTTGGTAACC 2640 

AATCTTCTAT CCATTCATTT TCGGATTCTC AATTTGGTCA TATTTTCGCA TTTGGTGAAA 2700 

ACCGTCAAGC TTCAAGAAAA CATATGGTTG TTGCCTTGAA AGAATTGAGT ATTAGAGGTG 2760 

ATTTTAGAAC TACTGTTGAG TATTTAATCA AATTGTTAGA AACTCCAGAT TTCGAGGATA 2820 

ATACCATTAC AACTGGTTGG TTGGATGAAT TAATCACCAA AAAGTTGACT GCTGAAAGAC 2880 

CAGATCCAAT AGTTGCTGTT GTTTGTGGAG CTGTAACCAA AGCACACATC CAGGCTGAGG 2940 

AAGAGAAAAA GGAATACATC CAATCTTTGG AAAAAGGTCA AGTTCCTCAC AGAAACTTAT 3000 

TGAAAACTAT TTTCCCAGTT GAGTTTATTT ATGAAGGTGA AAGATACAAG TTCACTGCTA 3060 

CTAAATCTTC AGAAGATAAA TATACTTTGT TCCTTAATGG TTCTCGTTGT GTTGTTGGTG 3120 

CACGTTCATT GTCCGATGGT GGTTTATTGT GTGCATTAGA TGGGAAATCA CATTCTGTCT 3180 

ATTGGAAGGA AGAGGCATCT GCCACTAGAT TATCAGTTGA TGGCAAAACT TGTTTATTAG 3240 

AAGTTGAAAA TGATCCAACA CAATTAAGAA CTCCATCTCC AGGTAAATTG GTCAAGTATT 3300 

TGGTTGACAG TGGTGAACAT GTTGATGCTG GTCAACCATA CGCTGAAGTC GAAGTTATGA 3360 

AAATGTGTAT GCCTTTGATT GCTCAAGAAA ATGGGGTAGT GCAGTTGATT AAACAACCGG 3420 

GTTCCACAGT TAATGCTGGT GATATCTTGG CCATTTTGGC ATTGGACGAT CCATCTAAGG 3480 

TCAAACATGC TAAACCATTT GAAGGTACTT TACCATCTAT GGGTGAGCCA AATGTTACAG 3540 

GTACTAAACC AGCACATAAA TTCAATCATT GTGCTGGTAT TTTGAAAAAC ATTTTGGCTG 3600 

GTTATGATAA TCAAGTGATT TTGAATTCTA CTTTAAAGAG TCTTGGTGAA GTTTTGAAAG 3660 

ACAATGAATT GCCATACTCT GAATGGCAAC AACAAATTTC AGCTTTACAC TCCAGATTGC 3720 

CACCTAAATT GGATGACGGA TTGACTGCAT TGGTTGAAAG AACTCAAAGT AGAGGTGCTG 3780 

AATTCCCTGC TCGTCAAATT TTAAAACTCA TCACCAAATC AATTGCTGAA AATGGTAATG 3840 

ATATGTTAGA AGATGTTGTT GCACCATTGG TTTCTATTGC CACAAGTTAC CAGAATGGTT 3900 

TGGTTGAACA CGAATACGAT TACTTTGCAT CTTTGATTAA CGAATATTAT GACGTTGAAA 3960 

GTTTGTTTTC AGGTGAAAAT GTTAGAGAAG ATAATGTTAT CTTGAAATTA AGAGATGAAA 4020 

ACAAATCTGA TTTGAAAAAA GTTATTGGTA TTGGTTTGTC TCATTCACGT GTTAGTGCCA 4080 

AGAACAATTT GATTTTAGCT ATTTTGGACA TTTATGAACC ATTGTTGCAA TCCAACTCGT 4140 

CAGTTGCTGC CTCTATCAGA GAAGCTTTAA AGAACTTGTT CATTAGACCT CGTGCTTGTG 4200 
CCAAAGTTGC ATTAAAGGCA AGAGAAATTT TAATTCAATG TTCTTTACCT TCCATCAAGG 4260 
AAAGATCCGA TCAATTGGAA CATATTTTGA GGTCATCTGT TGTTCAAACC TCTTATGGTG 4320 
AAATTTTTGC TAAACATAGA GAACCAAATT TGGAAATTAT TCGTGAGGTT GTTGATTCCA 4380 
AACATATTGT TTTTGATGTG TTGGCACAAT TCTTAATCAA TCCAGACCCA TGGGTTGCCA 4440 
TTGCTGCCGC TGAAGTTTAT GTCAGACGTT CATACCGTGC TTATGATTTG GGTAAAATTG 4500 
AATATCATGT TAATGACAGA CTTCCTATTG TTGAATGGAA ATTCAAGTTG GCTAATATGG 4560 
GAGCCGCTGG TGTAAACGAT GCTCAACAGG CTGCTGCTGC CGGTGGCGAT GATTCGACAT 4620 
CTATGAAACA TGCAGCTTCT GTGTCTGATT TGACCTTTGT TGTTGATTCT AAAACCGAGC 4680 
ATTCCACAAG AACTGGTGTT TTAGCTCCAG CAAGACACTT GGATGATGTT GATGAAACTC 4740 
TTACAGCTGC ATTGGAACAA TTCCAACCAG CCGATGCTAT TTCATTTAAA GCAAAGGGTG 4800 
AAACTCCAGA GTTATTAAAT GTTTTGAATA TTGTCATTAC CAGTATTGAT GGTTACTCCG 4860 
ATGAAAATGA ATACTTGAGC AGAATTAATG AAATCTTGTG CGAATACAAA GAAGAGTTGA 4920 
TTTCTGCTGG TGTTCGTCGT GTTACATTTG TTTTTGCTCA TCAAATTGGT CAATATCCTA 4980 
AATATTATAC TTTTACTGGT CCTGACTATG AAGAAAACAA GGTTATTAGA CACATTGAAC 5040 
CAGCTTTGGC TTTCCAATTG GAATTGGGAA GATTAGCCAA TTTCGATATC AAACCAATTT 5100 
TCACTAACAA CAGAAACATC CATGTATATG ATGCAATTGG GAAGAATGCT CCTTCTGATA 5160 
AAAGATTTTT CACCAGAGGG ATTATTAGAA CCGGTGTTCT TAAAGAAGAC ATTAGCATTA 5220 
GTGAATATTT GATTGCTGAA TCCAACAGAT TAATGAATGA TATTTTGGAT ACTTTAGAAG 5280 
TTATTGACAC TTCTAATTCT GATTTAAACC ATATTTTCAT TAACTTTTCC AATGCTTTCA 5340 

ATGTTCAAGC TTCAGATGTT GAGGCTGCCT TTGGATCATT CTTAGAAAGA TTTGGTAGAA 5400 

GATTATGGAG ATTAAGAGTT ACTGGTGCTG AAATTAGAAT TGTCTGTACT GATCCTCAAG 5460 

GTACTTCGTT CCCATTGCGT GCTATCATTA ATAATGTTTC TGGTTATGTT GTCAAATCAG 5520 

AATTGTATTT GGAAGTGAAA AATCCTAAAG GTGAATGGGT TTTCAAATCC ATTGGTCATC 5580 

CTGGTTCCAT GCATTTGAGA CCTATCTCAA CTCCATATCC AGTTAAAGAA TCTTTACAAC 5640 

CAAAACGTTA CAAGGCTCAC AATATGGGTA CCACTTATGT GTATGACTTC CCAGAATTGT 5700 

TTCGTCAAGC AACAATTTCA CAATGGAAAA AATATGGCAA AAAAGTACCA AAAGATGTTT 5760 

TCGTGTCTTT AGAATTGATC ACTGATGAAA CTGATTCCTT AATAGCTGTT GAAAGAGATC 5820 

CGGGTGCTAA CAAAATTGGA ATGGTTGGAT TCAAAGTCAC TGCTAAAACT CCTGAATACC 5880 

CTCATGGTCG TCAATTAATT ATTGTTGCCA ATGATATCAC CCACAAGATT GGTTCTTTTG 5940 

GTCCAGAAGA AGATAATTAT TTCAACAAGT GTACTGAATT GGCCAGAAAA TTAGGTATTC 6000 

CAAGAATTTA CCTTTCTGCA AATTCAGGTG CTAGAATTGG TGTTGCTGAG GAATTGATTC 6060 

CATTATACCA AGTTGCCTGG AATGAAGAAG GGTCTCCTGA CAAAGGATTC AGATACTTGT 6120 

ACTTGAGTAC TGCTGCTAAA GAGTCTTTAG AAAAAGATGG TAAAAGTGAC AGTGTTGTTA 6180 

CTGAACGTAT TGTTGAAAAA GGTGAAGAGC GTCATGTCAT TAAAGCTATT ATTGGTGCCG 6240 

AAGATGGCTT AGGGGTTGAA TGTCTTAAAG GATCAGGTTT AATTGCTGGT GCCACATCAA 6300 

GAGCTTACAA GGATATATTT ACCATCACTT TGGTAACTTG TAGATCTGTT GGTATTGGTG 6360 

CTTATTTGGT TAGATTGGGT CAAAGAGCCA TTCAAATCGA TGGTCAACCT ATTATTTTAA 6420 

CTGGTGCTCC TGCTATCAAT AAATTGTTGG GTAGAGAAGT GTATTCTTCC AATCTTCAAT 6480 

TGGGTGGTAC TCAAATCATG TACAATAATG GTGTTTCTCA TTTGACAGCT AATGATGATT 6540 

TGGCTGGGGT TGAAAAAATT ATGGAATGGT TATCATATGT TCCAGCTAAA CGTGGTTTAC 6600 

CAGTGCCAAT TTTGGAATCA GAAGATTCTT GGGACAGAGA TGTTGATTAC TACCCACCAA 6660 

AACAAGAAGC TTTTGATGTT AGATGGATGA TCCAAGGTAG AGAAGTTGAT GGTGAATATG 6720 

AATCTGGGTT ATTTGATAAA GATTCATTCC AAGAAACATT ATCTGGTTGG GCTAAAGGTG 6780 

TTGTTGTTGG TAGAGCACGT TTGGGTGGTA TTCCAATTGG TGTTATTGGT GTCGAAACCA 6840 

GAACAGTGGA AAACTTGATT CCTGCTGATC CAGCAAATCC AGACTCTACA GAAAGTTTGA 6900 

TTCAAGAAGC AGGTCAAGTG TGGTATCCTA ACTCTGCTTT TAAGACAGCA CAAGCTATAA 6960 

ATGATTTCAA CAATGGTGAA CAATTGCCAT TAATGATTTT AGCAAATTGG AGAGGTTTCT 7020 
CTGGTGGTCA AAGAGATATG TACAATGAAG TCTTGAAATA TGGTTCATTT ATTGTTGATG 7080 

CTTTAGTTGA CTTCAAGCAA CCTATCTTCA CTTACATTCC ACCAAATGGA GAATTGAGAG 7140 

GTGGCTCTTG GGTTGTTGTT GATCCAACCA TCAACTCAGA TATGATGGAA ATGTATGCCG 7200 

ATGTCGATTC GAGAGCTGGT GTTTTGGAAC CAGAAGGTAT GGTTGGTATC AAATACAGAC 7260 

GTGATAAATT ATTAGCAACT ATGGAAAGAT TAGATCCAAC TTATGGTGAA ATGAAAGCTA 7320 

AGTTAAATGA CTCGTCATTA TCTCCAGAAG AACACTCGAA AATAAGCGCC AAATTGTTTG 7380 

CACGTGAAAA GGCTTTATTA CCAATTTATG CTCAAATTTC CGTTCAATTT GCTGACTTGC 7440 

ACGATAGATC AGGTCGTATG TTGGCCAAGG GAGTTATTAG AAAGGAAATC AAATGGACTG 7500 

ATGCTAGACG TTTCTTCTTC TGGAGATTGA GAAGAAGATT GAACGAGGAA TATGTTTTGA 7560 

GATTGATTAG TGAACAAATT AAAGATTCTA GCAAATTGGA AAGAGTTGCC AGATTGAAGA 7620 

GTTGGATGCC AACTGTTGAA TACGATGATG ACCAAGCTGT CAGTAACTGG ATTGAAGAGA 7680 

ACCATGCCAA ATTGCAAAAG AGAGTTAATG AATTGAAACA AGAAGTTTCA AGAACCAAGA 7740 

TTATGAGATT ATTAAAAGAG GATCCAAATA GTGCAATTTC TGCAATGAAA GACTATGTTG 7800 

AAAGATTGTC AAAAGAAGAT AAAGAGAAAT TCCTCAAGGC ATTGAAGTAG AAGTGGTTTC 7860 

CATTAATTCA ACTTTTTAAT GACATTGAAA GTAGTAGTAG TTGTTGTTTT TTAGATTTAA 7920 

GTATATTATA TTATGTAATA AATTATAGAA AGTAATTATA GTTTTGACGG TTAATTGACG 7980 

AGAGTGGGAA ATTGGCTTTT TTGTTGCTCG TGTGATGAAA CAGTGATTGA CACAAAAAAA 8040 

TAGACAATGA AAAC 8054 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2270 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
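
As a cross-check between the nucleotide and peptide listings, the first codons of the open reading frame can be translated and compared with the start of SEQ ID NO: 3. This is an illustrative sketch only. The 27-nucleotide fragment is copied from SEQ ID NO: 2 beginning at the ATG near position 1038 (a position counted here from the listing's running totals, so treat it as approximate), the names `CODON_TABLE` and `translate` are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# No CTG codon occurs in this fragment, so the choice between the standard
# code and an alternative yeast code does not affect the result here.
fragment = "ATGAGATGCAAATTATCTCTAATAAAG"
print(translate(fragment))

# 2270 residues correspond to 2270 * 3 coding nucleotides (stop codon excluded).
print(2270 * 3)
```

The translated fragment can be compared by eye with the first nine residues listed for SEQ ID NO: 3 (Met Arg Cys Lys Leu Ser Leu Ile Lys).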




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Arg Cys Lys Leu Ser Leu Ile Lys Asn Thr Asn Ser Leu Val His 
1 5 10 15 

Arg Ser Arg Phe Leu Ile Thr Lys Pro Gln Leu Tyr Ile Pro His Arg 
20 25 30 

His Tyr Ile Pro Phe Lys Asn Ile Phe Lys Ser Leu Leu Met Ser Asp 
35 40 45 

Gln Ser Pro Ser Pro Ser Pro Ser Asp Ser Leu Ser Tyr Thr Thr Leu 
50 55 60 

His Glu Asn Leu Pro Ser His Phe Leu Gly Gly Asn Ser Val Leu Asn 
65 70 75 80 

Ala Glu Pro Ser Lys Val Arg Asp Phe Val Arg Ala His Gln Gly His 
85 90 95 

Thr Val Ile Ser Lys Ile Leu Ile Ala Asn Asn Gly Ile Ala Ala Val 
100 105 110 

Lys Glu Ile Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr Phe Gly Asp 
115 120 125 

Glu Lys Ala Ile Gln Phe Thr Val Met Ala Thr Pro Glu Asp Leu Glu 
130 135 140 

Ala Asn Ala Glu Tyr Ile Arg Met Ala Asp Gln Phe Ile Glu Val Pro 
145 150 155 160 

Gly Gly Thr Asn Asn Asn Asn Tyr Ala Asn Val Asp Leu Ile Val Glu 
165 170 175 

Ile Ala Glu Ser Thr Asn Ala His Ala Val Trp Ala Gly Trp Gly His 
180 185 190 

Ala Ser Glu Asn Pro Leu Leu Pro Glu Lys Leu Ala Ala Ser Pro Lys 
195 200 205 

Lys Ile Ile Phe Ile Gly Pro Pro Gly Ser Ala Met Arg Ser Leu Gly 
210 215 220 

Asp Lys Ile Ser Ser Thr Ile Val Ala Gln His Ala Gln Val Pro Cys 
225 230 235 240 

Ile Pro Trp Ser Gly Thr Gly Val Asp Glu Val Lys Ile Asp Pro Gln 
245 250 255 

Thr Asn Leu Val Ser Val Ala Asp Asp Ile Tyr Ala Lys Gly Cys Cys 
260 265 270 

Thr Ser Pro Glu Asp Gly Leu Glu Lys Ala Lys Lys Ile Gly Phe Pro 
275 280 285 

Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly Lys Gly Ile Arg Lys 
290 295 300 

Val Asp Asp Glu Lys Asn Phe Ile Thr Leu Tyr Asn Gln Ala Ala Asn 
305 310 315 320 

Glu Ile Pro Gly Ser Pro Ile Phe Ile Met Lys Leu Ala Gly Asp Ala 
325 330 335 

Arg His Leu Glu Val Gln Leu Leu Ala Asp Gln Tyr Gly Thr Asn Ile 
340 345 350 

Ser Leu Phe Gly Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile 
355 360 365 

Ile Glu Glu Ala Pro Val Thr Ile Ala Arg Lys Glu Thr Phe His Glu 
370 375 380 

Met Glu Asn Ala Ala Val Arg Leu Gly Lys Leu Val Gly Tyr Val Ser 
385 390 395 400 

Ala Gly Thr Val Glu Tyr Leu Tyr Ser His Ala Glu Asp Lys Phe Tyr 
405 410 415 

Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Thr Thr Glu 
420 425 430 

Met Val Thr Gly Val Asn Leu Pro Ala Ala Gln Leu Gln Ile Ala Met 
435 440 445 

Gly Ile Pro Met His Arg Ile Arg Asp Ile Arg Thr Leu Tyr Gly Ala 
450 455 460 

Asp Pro His Thr Thr Thr Asp Ile Asp Phe Glu Phe Lys Ser Glu Thr 
465 470 475 480 
Ser Leu Val Ser Gln Arg Arg Pro Thr Pro Lys Gly His Cys Thr Ala 
485 490 495 

Cys Arg Ile Thr Ser Glu Asp Pro Gly Glu Gly Phe Lys Pro Ser Gly 
500 505 510 

Gly Ser Leu His Glu Leu Asn Phe Arg Ser Ser Ser Asn Val Trp Gly 
515 520 525 

Tyr Phe Ser Val Gly Asn Gln Ser Ser Ile His Ser Phe Ser Asp Ser 
530 535 540 

Gln Phe Gly His Ile Phe Ala Phe Gly Glu Asn Arg Gln Ala Ser Arg 
545 550 555 560 

Lys His Met Val Val Ala Leu Lys Glu Leu Ser Ile Arg Gly Asp Phe 
565 570 575 

Arg Thr Thr Val Glu Tyr Leu Ile Lys Leu Leu Glu Thr Pro Asp Phe 
580 585 590 

Glu Asp Asn Thr Ile Thr Thr Gly Trp Leu Asp Glu Leu Ile Thr Lys 
595 600 605 

Lys Leu Thr Ala Glu Arg Pro Asp Pro Ile Val Ala Val Val Cys Gly 
610 615 620 

Ala Val Thr Lys Ala His Ile Gln Ala Glu Glu Glu Lys Lys Glu Tyr 
625 630 635 640 

Ile Gln Ser Leu Glu Lys Gly Gln Val Pro His Arg Asn Leu Leu Lys 
645 650 655 

Thr Ile Phe Pro Val Glu Phe Ile Tyr Glu Gly Glu Arg Tyr Lys Phe 
660 665 670 

Thr Ala Thr Lys Ser Ser Glu Asp Lys Tyr Thr Leu Phe Leu Asn Gly 
675 680 685 

Ser Arg Cys Val Val Gly Ala Arg Ser Leu Ser Asp Gly Gly Leu Leu 
690 695 700 

Cys Ala Leu Asp Gly Lys Ser His Ser Val Tyr Trp Lys Glu Glu Ala 
705 710 715 720 

Ser Ala Thr Arg Leu Ser Val Asp Gly Lys Thr Cys Leu Leu Glu Val 
725 730 735 

Glu Asn Asp Pro Thr Gln Leu Arg Thr Pro Ser Pro Gly Lys Leu Val 
740 745 750 

Lys Tyr Leu Val Asp Ser Gly Glu His Val Asp Ala Gly Gln Pro Tyr 
755 760 765 

Ala Glu Val Glu Val Met Lys Met Cys Met Pro Leu Ile Ala Gln Glu 
770 775 780 

Asn Gly Val Val Gln Leu Ile Lys Gln Pro Gly Ser Thr Val Asn Ala 
785 790 795 800 

Gly Asp Ile Leu Ala Ile Leu Ala Leu Asp Asp Pro Ser Lys Val Lys 
805 810 815 

His Ala Lys Pro Phe Glu Gly Thr Leu Pro Ser Met Gly Glu Pro Asn 
820 825 830 

Val Thr Gly Thr Lys Pro Ala His Lys Phe Asn His Cys Ala Gly Ile 
835 840 845 

Leu Lys Asn Ile Leu Ala Gly Tyr Asp Asn Gln Val Ile Leu Asn Ser 
850 855 860 

Thr Leu Lys Ser Leu Gly Glu Val Leu Lys Asp Asn Glu Leu Pro Tyr 
865 870 875 880 

Ser Glu Trp Gln Gln Gln Ile Ser Ala Leu His Ser Arg Leu Pro Pro 
885 890 895 

Lys Leu Asp Asp Gly Leu Thr Ala Leu Val Glu Arg Thr Gln Ser Arg 
900 905 910 

Gly Ala Glu Phe Pro Ala Arg Gln Ile Leu Lys Leu Ile Thr Lys Ser 
915 920 925 

Ile Ala Glu Asn Gly Asn Asp Met Leu Glu Asp Val Val Ala Pro Leu 
930 935 940 

Val Ser Ile Ala Thr Ser Tyr Gln Asn Gly Leu Val Glu His Glu Tyr 
945 950 955 960 

Asp Tyr Phe Ala Ser Leu Ile Asn Glu Tyr Tyr Asp Val Glu Ser Leu 
965 970 975 

Phe Ser Gly Glu Asn Val Arg Glu Asp Asn Val Ile Leu Lys Leu Arg 
980 985 990 

Asp Glu Asn Lys Ser Asp Leu Lys Lys Val Ile Gly Ile Gly Leu Ser 
995 1000 1005 

His Ser Arg Val Ser Ala Lys Asn Asn Leu Ile Leu Ala Ile Leu Asp 
1010 1015 1020 

Ile Tyr Glu Pro Leu Leu Gln Ser Asn Ser Ser Val Ala Ala Ser Ile 
1025 1030 1035 1040 

Arg Glu Ala Leu Lys Asn Leu Phe Ile Arg Pro Arg Ala Cys Ala Lys 
1045 1050 1055 

Val Ala Leu Lys Ala Arg Glu Ile Leu Ile Gln Cys Ser Leu Pro Ser 
1060 1065 1070 

Ile Lys Glu Arg Ser Asp Gln Leu Glu His Ile Leu Arg Ser Ser Val 
1075 1080 1085 

Val Gln Thr Ser Tyr Gly Glu Ile Phe Ala Lys His Arg Glu Pro Asn 
1090 1095 1100 

Leu Glu Ile Ile Arg Glu Val Val Asp Ser Lys His Ile Val Phe Asp 
1105 1110 1115 1120 

Val Leu Ala Gln Phe Leu Ile Asn Pro Asp Pro Trp Val Ala Ile Ala 
1125 1130 1135 

Ala Ala Glu Val Tyr Val Arg Arg Ser Tyr Arg Ala Tyr Asp Leu Gly 
1140 1145 1150 

Lys Ile Glu Tyr His Val Asn Asp Arg Leu Pro Ile Val Glu Trp Lys 
1155 1160 1165 

Phe Lys Leu Ala Asn Met Gly Ala Ala Gly Val Asn Asp Ala Gln Gln 
1170 1175 1180 

Ala Ala Ala Ala Gly Gly Asp Asp Ser Thr Ser Met Lys His Ala Ala 
1185 1190 1195 1200 

Ser Val Ser Asp Leu Thr Phe Val Val Asp Ser Lys Thr Glu His Ser 
1205 1210 1215 

Thr Arg Thr Gly Val Leu Ala Pro Ala Arg His Leu Asp Asp Val Asp 
1220 1225 1230 
Glu Thr Leu Thr Ala Ala Leu Glu Gln Phe Gln Pro Ala Asp Ala Ile 
1235 1240 1245 

Ser Phe Lys Ala Lys Gly Glu Thr Pro Glu Leu Leu Asn Val Leu Asn 
1250 1255 1260 

Ile Val Ile Thr Ser Ile Asp Gly Tyr Ser Asp Glu Asn Glu Tyr Leu 
1265 1270 1275 1280 

Ser Arg Ile Asn Glu Ile Leu Cys Glu Tyr Lys Glu Glu Leu Ile Ser 
1285 1290 1295 

Ala Gly Val Arg Arg Val Thr Phe Val Phe Ala His Gln Ile Gly Gln 
1300 1305 1310 

Tyr Pro Lys Tyr Tyr Thr Phe Thr Gly Pro Asp Tyr Glu Glu Asn Lys 
1315 1320 1325 

Val Ile Arg His Ile Glu Pro Ala Leu Ala Phe Gln Leu Glu Leu Gly 
1330 1335 1340 

Arg Leu Ala Asn Phe Asp Ile Lys Pro Ile Phe Thr Asn Asn Arg Asn 
1345 1350 1355 1360 

Ile His Val Tyr Asp Ala Ile Gly Lys Asn Ala Pro Ser Asp Lys Arg 
1365 1370 1375 

Phe Phe Thr Arg Gly Ile Ile Arg Thr Gly Val Leu Lys Glu Asp Ile 
1380 1385 1390 

Ser Ile Ser Glu Tyr Leu Ile Ala Glu Ser Asn Arg Leu Met Asn Asp 
1395 1400 1405 

Ile Leu Asp Thr Leu Glu Val Ile Asp Thr Ser Asn Ser Asp Leu Asn 
1410 1415 1420 

His Ile Phe Ile Asn Phe Ser Asn Ala Phe Asn Val Gln Ala Ser Asp 
1425 1430 1435 1440 

Val Glu Ala Ala Phe Gly Ser Phe Leu Glu Arg Phe Gly Arg Arg Leu 
1445 1450 1455 

Trp Arg Leu Arg Val Thr Gly Ala Glu Ile Arg Ile Val Cys Thr Asp 
1460 1465 1470 

Pro Gln Gly Thr Ser Phe Pro Leu Arg Ala Ile Ile Asn Asn Val Ser 
1475 1480 1485 

Gly Tyr Val Val Lys Ser Glu Leu Tyr Leu Glu Val Lys Asn Pro Lys 
1490 1495 1500 

Gly Glu Trp Val Phe Lys Ser Ile Gly His Pro Gly Ser Met His Leu 
1505 1510 1515 1520 

Arg Pro Ile Ser Thr Pro Tyr Pro Val Lys Glu Ser Leu Gln Pro Lys 
1525 1530 1535 

Arg Tyr Lys Ala His Asn Met Gly Thr Thr Tyr Val Tyr Asp Phe Pro 
1540 1545 1550 

Glu Leu Phe Arg Gln Ala Thr Ile Ser Gln Trp Lys Lys Tyr Gly Lys 
1555 1560 1565 

Lys Val Pro Lys Asp Val Phe Val Ser Leu Glu Leu Ile Thr Asp Glu 
1570 1575 1580 

Thr Asp Ser Leu Ile Ala Val Glu Arg Asp Pro Gly Ala Asn Lys Ile 
1585 1590 1595 1600 

Gly Met Val Gly Phe Lys Val Thr Ala Lys Thr Pro Glu Tyr Pro His 
1605 1610 1615 

Gly Arg Gin Leu lie He Val Ala Asn Asp lie Thr His Lys He Gly 

1620 1625 i 6 3o 

Ser Phe Gly Pro Glu Glu Asp Asn Tyr Phe Asn Lys Cys Thr Glu Leu 

1635 1640 1645 

Ala Arg Lys Leu Gly Ile Pro Arg Ile Tyr Leu Ser Ala Asn Ser Gly

1650 1655 1660

Ala Arg Ile Gly Val Ala Glu Glu Leu Ile Pro Leu Tyr Gln Val Ala
1665 1670 1675 1680

Trp Asn Glu Glu Gly Ser Pro Asp Lys Gly Phe Arg Tyr Leu Tyr Leu 

1685 1690 1695

Ser Thr Ala Ala Lys Glu Ser Leu Glu Lys Asp Gly Lys Ser Asp Ser

1700 1705 1710

Val Val Thr Glu Arg Ile Val Glu Lys Gly Glu Glu Arg His Val Ile

1715 1720 1725 

Lys Ala Ile Ile Gly Ala Glu Asp Gly Leu Gly Val Glu Cys Leu Lys

1730 1735 1740

Gly Ser Gly Leu Ile Ala Gly Ala Thr Ser Arg Ala Tyr Lys Asp Ile
1745 1750 1755 1760

Phe Thr Ile Thr Leu Val Thr Cys Arg Ser Val Gly Ile Gly Ala Tyr

1765 1770 1775 

Leu Val Arg Leu Gly Gln Arg Ala Ile Gln Ile Asp Gly Gln Pro Ile

1780 1785 1790

Ile Leu Thr Gly Ala Pro Ala Ile Asn Lys Leu Leu Gly Arg Glu Val

1795 1800 1805 

Tyr Ser Ser Asn Leu Gln Leu Gly Gly Thr Gln Ile Met Tyr Asn Asn

1810 1815 1820

Gly Val Ser His Leu Thr Ala Asn Asp Asp Leu Ala Gly Val Glu Lys
1825 1830 1835 1840

Ile Met Glu Trp Leu Ser Tyr Val Pro Ala Lys Arg Gly Leu Pro Val

1845 1850 1855 

Pro Ile Leu Glu Ser Glu Asp Ser Trp Asp Arg Asp Val Asp Tyr Tyr

1860 1865 1870

Pro Pro Lys Gln Glu Ala Phe Asp Val Arg Trp Met Ile Gln Gly Arg

1875 1880 1885

Glu Val Asp Gly Glu Tyr Glu Ser Gly Leu Phe Asp Lys Asp Ser Phe 

1890 1895 1900

Gln Glu Thr Leu Ser Gly Trp Ala Lys Gly Val Val Val Gly Arg Ala
1905 1910 1915 1920

Arg Leu Gly Gly Ile Pro Ile Gly Val Ile Gly Val Glu Thr Arg Thr

1925 1930 1935 

Val Glu Asn Leu Ile Pro Ala Asp Pro Ala Asn Pro Asp Ser Thr Glu

1940 1945 1950

Ser Leu Ile Gln Glu Ala Gly Gln Val Trp Tyr Pro Asn Ser Ala Phe

1955 1960 1965

Lys Thr Ala Gln Ala Ile Asn Asp Phe Asn Asn Gly Glu Gln Leu Pro
1970 1975 1980

Leu Met Ile Leu Ala Asn Trp Arg Gly Phe Ser Gly Gly Gln Arg Asp
1985 1990 1995 2000

Met Tyr Asn Glu Val Leu Lys Tyr Gly Ser Phe Ile Val Asp Ala Leu

2005 2010 2015 

Val Asp Phe Lys Gln Pro Ile Phe Thr Tyr Ile Pro Pro Asn Gly Glu

2020 2025 2030

Leu Arg Gly Gly Ser Trp Val Val Val Asp Pro Thr Ile Asn Ser Asp

2035 2040 2045 

Met Met Glu Met Tyr Ala Asp Val Asp Ser Arg Ala Gly Val Leu Glu 

2050 2055 2060 

Pro Glu Gly Met Val Gly Ile Lys Tyr Arg Arg Asp Lys Leu Leu Ala
2065 2070 2075 2080 

Thr Met Glu Arg Leu Asp Pro Thr Tyr Gly Glu Met Lys Ala Lys Leu 

2085 2090 2095 

Asn Asp Ser Ser Leu Ser Pro Glu Glu His Ser Lys Ile Ser Ala Lys

2100 2105 2110

Leu Phe Ala Arg Glu Lys Ala Leu Leu Pro Ile Tyr Ala Gln Ile Ser

2115 2120 2125

Val Gln Phe Ala Asp Leu His Asp Arg Ser Gly Arg Met Leu Ala Lys

2130 2135 2140

Gly Val Ile Arg Lys Glu Ile Lys Trp Thr Asp Ala Arg Arg Phe Phe
2145 2150 2155 2160 

Phe Trp Arg Leu Arg Arg Arg Leu Asn Glu Glu Tyr Val Leu Arg Leu 

2165 2170 2175 

Ile Ser Glu Gln Ile Lys Asp Ser Ser Lys Leu Glu Arg Val Ala Arg

2180 2185 2190

Leu Lys Ser Trp Met Pro Thr Val Glu Tyr Asp Asp Asp Gln Ala Val

2195 2200 2205

Ser Asn Trp Ile Glu Glu Asn His Ala Lys Leu Gln Lys Arg Val Asn

2210 2215 2220

Glu Leu Lys Gln Glu Val Ser Arg Thr Lys Ile Met Arg Leu Leu Lys
2225 2230 2235 2240

Glu Asp Pro Asn Ser Ala Ile Ser Ala Met Lys Asp Tyr Val Glu Arg

2245 2250 2255 

Leu Ser Lys Glu Asp Lys Glu Lys Phe Leu Lys Ala Leu Lys 
2260 2265 2270